Preface



STACS, the Symposium on Theoretical Aspects of Computer Science, is held
annually, alternating between France and Germany. STACS is organized jointly by
the Special Interest Group for Theoretical Computer Science of the Gesellschaft
für Informatik (GI) in Germany and the Maison de l'Informatique et des
Mathématiques Discrètes (MIMD) in France. STACS 2000 was the 17th in the
series. It was held in Lille from February 17th to 19th, 2000. Previous STACS
symposia took place in Paris (1984), Saarbrücken (1985), Orsay (1986), Passau
(1987), Bordeaux (1988), Paderborn (1989), Rouen (1990), Hamburg (1991), Cachan
(1992), Würzburg (1993), Caen (1994), München (1995), Grenoble (1996), Lübeck
(1997), Paris (1998), and Trier (1999). All STACS proceedings have been
published in the Lecture Notes in Computer Science of Springer-Verlag.

STACS has become one of the most important annual meetings in Europe
for the theoretical computer science community. It covers a wide range of topics 
in the area of foundations of computer science. This time, 146 submissions from 
30 countries were received, all in electronic form. Jochen Bern designed the 
electronic submission procedure, which performed marvelously. Many thanks to 
Jochen. 

The submitted papers address fundamental problems from many areas of
computer science: algorithms and data structures, automata and formal
languages, complexity, verification, logic, and cryptography. Many tackled new
areas, including mobile computing and quantum computing. During the program
committee meeting in Lille, 51 papers were selected for presentation. Most of
the papers were evaluated by five members of the program committee, partly with
the assistance of subreferees, for a total of 700 reports. We thank the program
committee for its demanding work in the evaluation process. We also thank all
the reviewers whose names are listed on the next pages.

We are especially grateful to the invited speakers Pascal Koiran, Thomas
Henzinger, and Amin Shokrollahi for accepting our invitation and sharing
their insights into their research areas.

We would like to express our sincere gratitude to Anne-Cécile Caron, Rémi
Gilleron, and Marc Tommasi, who invested their time and energy to organize this
conference. Thanks also to all members of the Laboratoire d'Informatique
Fondamentale de Lille, especially to our secretaries, Annie Dancoisne and
Michèle Driessens.

The conference was made possible by the financial support of the following
institutions: European Community, Ministère des Affaires Étrangères, Ministère
de l'Éducation Nationale, de la Recherche et de la Technologie, Ministère de la
Défense, Région Nord/Pas-de-Calais, Ville de Lille, Université de Lille 1, and
other organizations.
Abstract. In this paper, I will give a brief introduction to the theory
of low-density parity-check codes, and their decoding. I will emphasize 
the case of correcting erasures as it is still the best understood and most 
accessible case. At the end of the paper, I will also describe more recent 
developments. 



1 Introduction 

In this paper, I want to give a brief introduction to the theory of low-density
parity-check codes, or LDPC codes for short. These codes were first introduced
in the early 1960's by Gallager in his PhD thesis [?]. They are built using
sparse bipartite graphs in a manner that we will describe below. As it turns
out, an analysis of these codes requires tools and methods from graph theory,
most of which were not common knowledge in the early 1960's. This fact may
explain to some extent why LDPC codes were almost completely forgotten after
their invention. As it turned out, major contributions to the theoretical
analysis of these codes came not from coding theory, but from Theoretical
Computer Science, and this is why this paper appears in the proceedings of a
conference on Theoretical Aspects of Computer Science.

I will deliberately be very brief on the history of LDPC codes since I would
like to concentrate more on very recent developments. But no paper on this
topic would be complete without mentioning the names of Zyablov and Pinsker [?]
and Margulis [?] from the Russian school, who had realized the potential of
LDPC codes in the 1970's, and the name of Tanner, who re-invented and extended
LDPC codes. In fact, re-invention seems to be a recurring theme: with the
advent of the powerful class of Turbo codes [?], many researchers started to
study other types of codes which have fast encoders and decoders and can
perform at rates very close to the theoretical upper bounds derived by Shannon
[?]. For instance, MacKay re-invented some versions of LDPC codes and derived
many interesting and useful properties. His paper [?] is a must for anybody who
wants to work in this field, as it gives a fresh and detailed look at various
aspects of LDPC codes. At the same time as coding theorists were struck by the
performance of Turbo codes and were starting to remember LDPC codes, these
codes were again re-invented, this time in the Theoretical Computer Science
community. Graphs are very useful objects in this field (and one may sometimes
get the impression that everything in this field is either based on, or
motivated by, or derived from graphs!). On the other hand, linear codes had
been shown to be very useful in the construction of Probabilistically Checkable
Proofs [?]. It seems like a good idea to combine these concepts. Sipser and
Spielman did exactly this [?]. One of the (many) very interesting results of
this work is that it shows how the performance of the codes constructed is
directly related to the expansion properties of the underlying graph. The
Russian school seems to have known this too, though Sipser and Spielman were
completely unaware of it.

A very useful property of LDPC codes is that it is very easy to design various
efficient encoders and decoders for them. This was already done by Gallager,
though largely without a rigorous analysis. His algorithms were re-invented,
improved, and analyzed later. For instance, Sipser and Spielman give a simple
encoder and decoder which run in time $O(n^2)$ and time $O(n)$ respectively,
where $n$ is the block-length of the code (see below for a definition of this
parameter). The expansion property described above actually translates into the
error performance of the algorithm, rather than that of the abstract code.
Spielman [?] applies another idea to decrease the encoding time to linear as
well, and uses the best known explicit expanders to construct codes that not
only have very efficient encoders and decoders, but are also "asymptotically
good." I will not describe further what this technical term means, and leave it
at the remark that the construction of asymptotically good codes is very
difficult in itself, even if one does not require that they are efficiently
encodable and decodable.

Motivated by the practical problem of sending packets through high-speed
computer networks, a group at UC Berkeley and the International Computer
Science Institute in Berkeley consisting of Luby, Mitzenmacher, Spielman,
Stemann, and myself designed a very simple algorithm for correcting erasures
for LDPC codes [?]. Similar codes had already been constructed by Alon and Luby
[?], but they performed poorly in practice. The work [?] had several major
impacts on subsequent work on LDPC codes. First, it contained for the first
time a rigorous analysis of a probabilistic decoding algorithm for LDPC codes.
This analysis was later greatly simplified by Luby, Mitzenmacher, and myself
[7], and this later method developed into the core of the analysis of LDPC
codes under other, more complicated error models. Second, the paper [?] proves
that highly irregular bipartite graphs perform much better than regular graphs
(which were the method of choice up to then) if the particular simple decoder
of that paper is used. The paper goes even further: such codes not only perform
better, but if the graphs are sampled randomly from the set of graphs with a
particular degree distribution, then the codes can be used to transmit at rates
arbitrarily close to the capacity of the erasure channel. In other words, the
decoder can recover from a portion of the encoding which is arbitrarily close
to the lower bound dictated by the length of the (uncoded) message. In short,
we say that these sequences of degree distributions are capacity-achieving.

The model of erasures is very realistic for applications such as data transfer
on high-speed networks, but codes are typically used in situations where one
does not know the positions of the errors. Here the problem is much harder.
Based on the approach in [?] and equipped with the new analysis in [?], Luby,
Mitzenmacher, Spielman, and I rigorously analyzed some of Gallager's original
"flipping" decoders and invented methods to design appropriate degree
distributions so that the corresponding graphs could recover from as many
errors as possible [?]. To obtain a better performance, the decoder had to be
changed. The most powerful efficient decoder known to date is the so-called
belief-propagation decoder. It was tested on good erasure codes, and the
results were reported in [?]. The hope was that, since codes that can decode
many erasures are capable of decoding many errors if the (exponential time)
maximum-likelihood decoder is used, and since belief propagation is a very good
decoder, good erasure codes should perform very well under belief propagation
as well. As it turns out, they perform well in experiments, but they do not
beat Turbo codes. Moreover, we did not have a method to analyze the asymptotic
performance of these codes, and had to rely on heuristic experiments to judge
their quality.

Such an analysis was derived by Richardson and Urbanke [?] by generalizing
the analysis in [?] (which was itself based on that of [7]). Based on this
analysis and using the methods in the previous papers by Luby et al., enriched
with some new weapons, Richardson, Urbanke, and I [?] were able to construct
codes that perform at rates much closer to the capacity than Turbo codes.

In the rest of the paper I will try to describe some of the details left out in 
the above discussion. I will define most of the objects that we have to work with, 
though sometimes rigor is traded against clarity. 

2 Channels and Codes 

For most of what we will be describing in this paper, the following definition
of a communication channel will be sufficient: A channel is a finite labeled
directed bipartite graph between a set $A$ called the code-alphabet and a set
$B$ called the output-alphabet such that the labels are nonnegative real
numbers and satisfy the following property: for any element $a \in A$, the sum
of the labels of edges emanating from $a$ is 1. Semantically, the graph
describes a communication channel in which elements from $A$ are transmitted
and those of $B$ are received. The label of an edge from $a$ to $b$ is the
(conditional) probability of obtaining $b$ given that $a$ was transmitted.
Examples of channels are given in Figure 1.

Our aim is to reliably transmit information through an unreliable channel,
i.e., a channel which can cause errors with nonzero probability. We want,
thereby, to reduce the error in the communication. A first idea to do this is
to transmit blocks of symbols rather than individual symbols from $A$. If the
symbols are chosen uniformly at random from $A$ (an assumption commonly made),
then this scheme does not provide more protection than the original one. The
main idea behind reducing the error is that of adding redundancy. The
computation of redundant symbols from a message is called encoding. This
operation produces a codeword from a message word, and we assume that it is an
injective map from the set of message words to the set of codewords. A code is
then the set of codewords obtained this way. The counter-operation to encoding
is something we call de-encoding. This operation computes from a codeword the
original message word (usually by forgetting the redundant symbols). A more
important operation is that of decoding. It assigns to a word $x$ over the
alphabet $A$ a codeword $c$. In applications, $x$ is the corrupted version of a
codeword $c'$, and the decoder is successful if $c = c'$.

Fig. 1. (a) The binary erasure channel, and (b) the binary symmetric channel

It should be intuitively clear that adding redundancy does indeed reduce the
error. For instance, by repeating each transmitted symbol $r$ times and using
majority rule for decoding, one can easily reduce the decoding error below any
constant $\epsilon$ by increasing $r$. This technique, called repetition
coding, is used by many teachers who repeat their material several times to
reach all their students.
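
The effect of repetition coding can be checked numerically. The following is a minimal sketch, assuming transmission over a binary symmetric channel with crossover probability `p` and an odd number of repetitions `r`; the function name and the parameter values are illustrative and not taken from the paper.

```python
from math import comb


def majority_error_probability(r: int, p: float) -> float:
    """Probability that majority decoding of an r-fold repetition fails
    on a binary symmetric channel with crossover probability p."""
    if r % 2 == 0:
        raise ValueError("use an odd r so that the majority is always defined")
    # Decoding fails exactly when more than half of the r copies are flipped.
    return sum(comb(r, k) * p**k * (1 - p) ** (r - k)
               for k in range(r // 2 + 1, r + 1))


# The error shrinks as r grows, but so does the fraction 1/r of useful symbols.
for r in (1, 3, 5, 11, 21):
    print(r, 1 / r, majority_error_probability(r, 0.1))
```

The printed table makes the trade-off visible: the decoding error falls quickly with `r`, while the share of information symbols per transmitted symbol falls as `1/r`.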

If $k$ symbols of the alphabet are encoded to $n$ symbols, then $n$ is called
the block-length and the fraction $k/n$ is called the rate of the code. The
rate equals the fraction of real information symbols in a codeword. The
repetition code described above has block-length $rk$, where $k$ is the length
of the message. Its rate is thus $1/r$, which decreases to zero as $r$
increases to infinity. Since sending information over a channel is often
expensive, it is desirable to have sequences of codes of constant rate for
which the decoding error probability decreases to zero as the block-length goes
to infinity.

In a fundamental paper which marks the birth of modern coding and information
theory, Shannon [?] completely answered questions of this type. He showed that
if the "maximum-likelihood decoding algorithm" is used (which decodes a word
over the alphabet to a codeword of minimal Hamming distance), then for a given
channel there is a critical rate, called the capacity of the channel, such that
the decoding error probability for any code of rate larger than the capacity
approaches 1. Furthermore, he showed, using a random coding argument, that for
any rate below the capacity there are codes of that rate for which the decoding
error probability decreases to zero exponentially fast in the block-length, as
the latter goes to infinity. Computing the capacity is not an easy task in
general. Figure 1 shows the capacities of the binary symmetric channel and the
binary erasure channel.
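
For the two channels of Figure 1 the capacity has a well-known closed form. The sketch below uses the standard formulas (capacity $1 - p$ for erasure probability $p$, and $1 - H(p)$ with the binary entropy function $H$ for crossover probability $p$); the function names are illustrative and the formulas are quoted from standard information theory rather than from the figure itself.

```python
from math import log2


def binary_entropy(p: float) -> float:
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def bec_capacity(p: float) -> float:
    """Capacity of the binary erasure channel with erasure probability p."""
    return 1.0 - p


def bsc_capacity(p: float) -> float:
    """Capacity of the binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)


for p in (0.01, 0.1, 0.25, 0.5):
    print(p, bec_capacity(p), bsc_capacity(p))
```

For the same parameter `p` the erasure channel always has the larger capacity, which matches the intuition that knowing where the damage occurred is worth something.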

Shannon's paper answered many old questions and generated many new ones.
Because of the nature of random coding used in his proofs, the first question
was about how to explicitly construct the codes promised by that theorem.
The second, more serious question was that of efficient decoding of such codes,
as maximum-likelihood decoding is a very hard task in general. (It was shown
many years later that a corresponding decision problem is NP-hard [?].)

Fig. 2. The two versions of LDPC codes: (a) Original version, and (b) dual
version

Low-density parity check (LDPC) codes, described in the next section, are 
very well suited to (at least partially) answering both of these questions. 

3 LDPC Codes 

3.1 Code Construction 

In the following we will assume that the code-alphabet $A$ is the binary field
GF(2). Let $G$ be a bipartite graph between $n$ nodes on the left called
message nodes and $r$ nodes on the right called constraint (or check) nodes.
The graph gives rise to a code in (at least) two different ways, see Figure 2.
In the first version (which is Gallager's original version), the coordinates of
a codeword are indexed by the message nodes $1, \ldots, n$ of $G$. A vector
$(x_1, \ldots, x_n)$ is a valid codeword if and only if for each constraint
node the sum (over GF(2)) of the values of its adjacent message nodes is zero.
Since each constraint node imposes one linear condition on the $x_i$, the rate
of the code is at least $(n - r)/n$.

In the second version, the message nodes are indexed by the original message.
The constraint nodes contain the redundant information: the value of each such
node is equal to the sum (over GF(2)) of the values of its adjacent message
nodes. The block-length of this code is $n + r$, and its rate is $n/(n + r)$.

These two versions look quite similar, but differ fundamentally from a
computational point of view. The encoding time of the second version is
proportional to the number of edges in the graph $G$, while it is not clear how
to encode the first version without solving systems of linear equations. (This
needs to be done once for the graph; each encoding afterwards corresponds to a
matrix/vector multiplication.) If the graph is sparse, the encoding time for
the second version is essentially linear in the block-length, while that of the
first version is essentially quadratic (after a pre-processing step).
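
Encoding in the second version can be sketched in a few lines. The graph representation used here (one list of adjacent message-node indices per constraint node) and the function name are assumptions made for illustration only.

```python
def encode_second_version(message, checks):
    """Append one redundant bit per constraint node.

    message: list of bits (0 or 1), one per message node.
    checks:  for each constraint node, the indices of its adjacent
             message nodes (hypothetical adjacency-list representation).
    """
    redundancy = []
    for neighbours in checks:
        value = 0
        for i in neighbours:
            value ^= message[i]  # addition over GF(2) is XOR
        redundancy.append(value)
    return list(message) + redundancy


# Four message nodes, three constraint nodes, eight edges.
checks = [[0, 1], [1, 2, 3], [0, 3]]
print(encode_second_version([1, 0, 1, 1], checks))
```

Each edge of the graph is visited exactly once, so the running time is proportional to the number of edges.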

While the second version is advantageous for the encoding, the first version
is more suited to decoding. I don't want to go into further details on this
issue, and will in the following only consider Gallager's original version of
LDPC codes. Readers are invited to consult [?] to learn more about the second
version.

3.2 Decoding on the Erasure Channel 

As was mentioned earlier, the principal motivation behind our work on erasure
codes was the design of forward error-correction schemes in high-speed computer
networks. When data is sent over such a network, it is divided into packets.
Each packet has an identifier which uniquely describes the entity it comes from
and its location within that entity. Packets are then routed through the
network from a sender to a recipient. Often, certain packets do not arrive at
their destination; in certain protocols like TCP/IP the recipient requests in
this case a retransmission of the packets that have not arrived, upon which the
sender initiates the retransmission. These steps are iterated several times
until the receiver has obtained the complete data. This protocol is excellent
in certain cases, but is very poor in scenarios in which feedback channels do
not exist (satellite links), or when one sender has to serve a large number of
recipients (multicast). The channel corresponding to this scenario is very well
modeled by an erasure channel, and corresponding codes can be used to remedy
the mentioned shortcomings.

A linear code over a field $\mathbb{F}$ of block-length $n$ and dimension $k$
is a $k$-dimensional subspace of the standard vector space $\mathbb{F}^n$. The
minimal Hamming weight of a nonzero element in a linear code is called the
minimum distance of the code, usually denoted by $d$. It is not hard to see [?]
that a linear code of minimum distance $d$ is capable of correcting any pattern
of $d - 1$ or fewer erasures, essentially by solving a system of linear
equations of size $O(d)$ over $\mathbb{F}$. Further, Elias showed that random
linear codes achieve capacity of the erasure channel with high probability. The
running time of $O(d^3)$ of the decoder is, however, far too slow for our
applications, in which $d$ has to be very large (in the 100,000's).

The decoder that we use for the LDPC codes is extremely simple; we will
describe it in the case of the binary erasure channel in the following. The
decoder maintains a register for each of the message and constraint nodes. All
of these registers are initially set to zero. In the first round of the
decoding, the value of each received message node is added to the values of all
of its adjacent constraint nodes, and then the message node and all the edges
emanating from it are deleted. Once this direct recovery step is complete, the
second phase, substitution recovery, kicks in. Here, one looks for a constraint
node of degree one. Note that since the value of a constraint node in an intact
codeword should be zero, a constraint node of degree one contains the value of
its unique adjacent message node. This value is copied into the corresponding
message node, that value is added to those of all its adjacent constraint
nodes, and the message node together with all edges emanating from it are
deleted from the graph. If there are no nodes left, or if there are no
constraint nodes of degree one left, then the decoder stops. Note that the
decoding time is proportional to the number of edges in the graph. If the graph
is sparse, i.e., if the number of edges is linear in the number of nodes, then
the decoder is linear time (at least on a RAM with unit cost measure).
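
The two phases of the decoder can be sketched as follows. This is a minimal illustration for Gallager's original version, assuming a simple graph given as adjacency lists of the constraint nodes and erasures marked by `None`; the names `peel_erasures`, `value` and `unknown` are hypothetical.

```python
def peel_erasures(received, checks):
    """Erasure decoder sketch: returns the word with as many erased
    positions (None) filled in as the peeling process can recover."""
    word = list(received)
    value = [0] * len(checks)            # register of each constraint node
    unknown = [set() for _ in checks]    # erased neighbours still attached
    adjacent = [[] for _ in word]        # constraint nodes of each message node

    # Direct recovery: add every received bit to its constraint nodes;
    # edges to received message nodes are thereby "deleted".
    for c, neighbours in enumerate(checks):
        for m in neighbours:
            adjacent[m].append(c)
            if word[m] is None:
                unknown[c].add(m)
            else:
                value[c] ^= word[m]

    # Substitution recovery: repeatedly use constraint nodes of degree one.
    ready = [c for c in range(len(checks)) if len(unknown[c]) == 1]
    while ready:
        c = ready.pop()
        if len(unknown[c]) != 1:
            continue                     # degree changed since it was queued
        m = next(iter(unknown[c]))
        word[m] = value[c]               # the register holds the missing bit
        for c2 in adjacent[m]:
            unknown[c2].discard(m)
            value[c2] ^= word[m]
            if len(unknown[c2]) == 1:
                ready.append(c2)
    return word


# Small example: 7 message nodes, 3 constraint nodes, codeword 1011001.
checks = [[0, 1, 2, 4], [1, 2, 3, 5], [0, 2, 3, 6]]
print(peel_erasures([None, 0, 1, 1, 0, None, 1], checks))
```

Every edge is touched a bounded number of times, in line with the linear running time claimed above. When no constraint node of degree one remains, the function returns with some positions still set to `None`, which corresponds to the decoder stopping unsuccessfully.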

The hope is that there is always enough supply of degree one constraint nodes 
so that the decoder finishes successfully. Whether or not this is the case depends 
on the original fraction of erasures and on the graph. Surprisingly, however, the 
only important parameter of the underlying graph is the distribution of nodes 
of various degrees. This analysis is the topic of the next section. 

3.3 The Analysis 

To describe the conditions for successful decoding concisely, we need one
further piece of notation. We call an edge in the graph $G$ of left (right)
degree $i$ if it is connected to a message (constraint) node of degree $i$. Let
$\lambda_i$ and $\rho_i$ denote the fraction of edges of left degree $i$ and
right degree $i$, respectively. Further, we define the generating functions
$\lambda(x) = \sum_i \lambda_i x^{i-1}$ and $\rho(x) = \sum_i \rho_i x^{i-1}$.
The rather peculiar look of the exponent of $x$ in these polynomials is an
artifact of the particular message-passing decoding that we are using. This is
best explained by the analysis itself, which I will now describe in an informal
way.

Let $e$ be an edge between the message node $m$ and the constraint node $c$.
What is the probability that this edge is deleted at the $\ell$th round of the
algorithm? This is the probability that the check node $c$ is of degree one at
the $\ell$th round, and, equivalently, it is the probability that the message
node $m$ is corrected at that round. To compute this probability, we unroll the
graph in the neighborhood of the node $m$ and consider the subgraph obtained by
the neighborhood of depth $\ell$ of $m$. This is the subgraph of all the nodes
in the graph, except those that are connected to $m$ via the edge $e$, for
which there is a path of length at most $2\ell$ connecting them to $m$. In the
following we will assume that this graph is a tree. Suppose that the graph is
sampled uniformly at random from the set of graphs which have an edge
distribution according to the polynomials $\lambda(x)$ and $\rho(x)$. Let
$p_\ell$ denote the probability that $m$ is not corrected at round $\ell$.
Further, let $\delta$ denote the original fraction of erasures. Then, obviously
$p_0 = \delta$. Further, because we have assumed that the neighborhood of $m$
is a tree, at each level $i$ of the tree the message nodes are still erased
with independent probability $p_i$. (We assume that only the message nodes
contribute to levels in the tree, so that the message nodes forming the leaves
are at level 0 and the root $m$ is at level $\ell$.) From this, we can
establish a recursion for $p_\ell$. A message node at level $\ell + 1$ is not
corrected if and only if it has not been received directly, and all the
constraint nodes it is connected to have degree larger than 1. A constraint
node has degree one if and only if all its descending message nodes at level
$\ell$ have already been corrected. This happens with independent probability
$1 - p_\ell$ for each of them, and since the constraint node has $j$ edges
emanating from it with probability $\rho_j$, and $j - 1$ of them lead to
descending message nodes in the tree, the probability that such a check node is
of degree one is $\rho(1 - p_\ell)$. Hence, the probability that a message node
at level $\ell + 1$ is connected only to descending constraint nodes of degree
larger than 1 is $\lambda(1 - \rho(1 - p_\ell))$. That node is thus not
corrected with probability $\delta\lambda(1 - \rho(1 - p_\ell))$, where the
factor $\delta$ accounts for the probability that the node has not been
received directly. Hence, this gives
$p_{\ell+1} = \delta\lambda(1 - \rho(1 - p_\ell))$. Altogether, we obtain the
condition

$$\delta\lambda(1 - \rho(1 - p_\ell)) < p_\ell \qquad (1)$$

for successful decoding. More precisely, this says that if neighborhoods of
depth $\ell$ of message nodes are trees, and if
$\delta\lambda(1 - \rho(1 - x)) < (1 - \epsilon)x$ for $x \in (0, \delta)$,
then after $\ell$ rounds of the algorithm the probability that a message node
has not been corrected is at most $(1 - \epsilon)^\ell \delta$. For large
random graphs the probability that the neighborhood of a message node is not a
tree is small, and the argument shows that the decoding algorithm reduces the
fraction of undecoded message nodes below any constant. To show that the
process finishes successfully, one needs expansion [?].

The above informal discussion can be made completely rigorous using proper
martingale arguments [?]. Summarizing, the condition for successful decoding
after a $\delta$-fraction of erasures is

$$\delta\lambda(1 - \rho(1 - x)) < x \quad \text{for } x \in (0, \delta). \qquad (2)$$
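
The recursion behind this condition is easy to iterate numerically. The sketch below assumes degree distributions stored as dictionaries mapping a degree $i$ to the corresponding edge fraction, and it estimates by bisection the largest erasure fraction for which the iteration drives the erasure probability to (numerically) zero. The function names, the number of rounds and the tolerances are illustrative choices, not part of the paper.

```python
def evaluate(coeffs, x):
    """Evaluate sum_i coeffs[i] * x**(i - 1) for an edge degree distribution."""
    return sum(c * x ** (i - 1) for i, c in coeffs.items())


def erased_fraction(delta, lam, rho, rounds=2000):
    """Iterate p <- delta * lam(1 - rho(1 - p)) starting from p = delta."""
    p = delta
    for _ in range(rounds):
        p = delta * evaluate(lam, 1.0 - evaluate(rho, 1.0 - p))
    return p


def threshold(lam, rho, tol=1e-4):
    """Bisection estimate of the largest tolerable erasure fraction."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if erased_fraction(mid, lam, rho) < 1e-6:
            lo = mid   # the recursion converged to zero: decoding succeeds
        else:
            hi = mid   # the recursion got stuck at a positive fixed point
    return lo


# Regular graph: all message nodes of degree 3, all constraint nodes of degree 6.
print(threshold({3: 1.0}, {6: 1.0}))
```

For this regular example the estimate lands a little below 0.43, noticeably short of the value $1 - R = 0.5$ that a rate-1/2 code could tolerate at best.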



3.4 Capacity Achieving Sequences 

The condition (2) is very handy if one wants to analyse the performance of
random graphs with a given degree distribution. For instance, it turns out that
the performance of regular graphs deteriorates as the degree of the message
nodes increases [?]. In fact, the best performance is obtained if all message
nodes have degree three. On the other hand, this condition does not give a clue
on how to design good degree distributions $\lambda$ and $\rho$. Our aim is to
construct sequences that asymptotically achieve the capacity of the erasure
channel. In other words, we want $\delta$ in (2) to be arbitrarily close to
$1 - R$, where $R$ is the rate of the code. To make this definition more
rigorous, we call a sequence $(\lambda_\ell, \rho_\ell)_{\ell \ge 0}$
capacity-achieving of rate $R$ if (a) the corresponding graphs give rise to
codes of rate at least $R$, and (b) for all $\epsilon > 0$ there exists an
$\ell_0$ such that for all $\ell \ge \ell_0$ we have

$$(1 - R)(1 - \epsilon)\,\lambda_\ell(1 - \rho_\ell(1 - x)) < x \quad \text{for } x \in (0, (1 - R)(1 - \epsilon)).$$

It is surprising that such sequences do really exist. The first such sequence
was discovered in [?]. To describe it, we first need to mention that, given
$\lambda$ and $\rho$, the average left and right degree of the graph is
$1/\sum_i \lambda_i/i$ and $1/\sum_i \rho_i/i$, respectively. These quantities
can be conveniently expressed as $1/\int_0^1 \lambda(x)\,dx$ and
$1/\int_0^1 \rho(x)\,dx$. As a result, the rate of the code is at least
$1 - \int_0^1 \rho(x)\,dx / \int_0^1 \lambda(x)\,dx$. It is a nice exercise to
deduce from the equation (2) alone that $\delta$ is always less than or equal
to $1 - R$, i.e., less than or equal to
$\int_0^1 \rho(x)\,dx / \int_0^1 \lambda(x)\,dx$.
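
As a small worked example of these formulas, the following sketch computes the average degrees and the rate bound exactly with rational arithmetic. The dictionary representation of the edge fractions and the function names are assumptions for illustration.

```python
from fractions import Fraction


def integral(coeffs):
    """Integral over [0, 1] of sum_i c_i x^(i-1), i.e. sum_i c_i / i."""
    return sum(Fraction(c) / i for i, c in coeffs.items())


def average_degree(coeffs):
    """Average node degree implied by an edge degree distribution."""
    return 1 / integral(coeffs)


def rate_lower_bound(lam, rho):
    """Lower bound 1 - (integral of rho) / (integral of lam) on the rate."""
    return 1 - integral(rho) / integral(lam)


lam = {3: 1}   # every edge is attached to a message node of degree 3
rho = {6: 1}   # every edge is attached to a constraint node of degree 6
print(average_degree(lam), average_degree(rho), rate_lower_bound(lam, rho))
```

For this regular graph the average degrees are 3 and 6, and the rate bound is 1/2, as one would expect from counting edges on both sides.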
The first examples of capacity-achieving sequences of any given rate $R$ were
discovered in [?], and I will describe them in the following: fix a parameter
$D$ and let $\lambda_D(x) := \frac{1}{H(D)} \sum_{i=1}^{D} x^i/i$, where $H(D)$
is the harmonic sum $\sum_{i=1}^{D} 1/i$. Note that
$\int_0^1 \lambda_D(x)\,dx = \frac{1}{H(D)}\left(1 - 1/(D + 1)\right)$. Let
$\rho_D(x) := e^{\mu(x - 1)}$, where $\mu$ is the unique solution to the
equation

$$\frac{1}{\mu}\left(1 - e^{-\mu}\right) = \frac{1 - R}{H(D)}\left(1 - \frac{1}{D + 1}\right).$$

Then the sequence $(\lambda_D(x), \rho_D(x))_{D \ge 1}$ gives rise to codes of
rate at least $R$. Further, we have

$$\delta\lambda_D(1 - \rho_D(1 - x)) = \delta\lambda_D\left(1 - e^{-\mu x}\right) \le -\frac{\delta}{H(D)} \ln\left(e^{-\mu x}\right) = \frac{\delta\mu x}{H(D)}.$$

Hence, successful decoding is possible if the fraction of erasures is no more
than $H(D)/\mu$. Note that this quantity equals
$(1 - R)(1 - 1/(D + 1))/(1 - e^{-\mu})$, and that this quantity is larger than
$(1 - R)(1 - 1/D)$. Hence, we have that

$$(1 - R)(1 - 1/D)\,\lambda_D(1 - \rho_D(1 - x)) < x \quad \text{for } x \in (0, (1 - R)(1 - 1/D)).$$

This shows that the sequence is indeed capacity-achieving. We have named these
sequences the Heavy-Tail/Poisson sequences, or, more commercially oriented,
Tornado codes.
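
The claims about the Heavy-Tail/Poisson sequence can be checked numerically for concrete parameters. The sketch below assumes the definitions given above (left distribution built from the harmonic sum, right distribution $e^{\mu(x-1)}$), solves the defining equation for $\mu$ by bisection, and tests the decoding condition on a grid. The function names, the grid size and the parameter values are illustrative.

```python
from math import exp


def harmonic(D):
    """Harmonic sum H(D) = 1 + 1/2 + ... + 1/D."""
    return sum(1.0 / i for i in range(1, D + 1))


def solve_mu(D, R):
    """Solve (1 - exp(-mu)) / mu = (1 - R) (1 - 1/(D+1)) / H(D) by bisection.
    The left-hand side decreases from 1 to 0 as mu grows."""
    target = (1.0 - R) * (1.0 - 1.0 / (D + 1)) / harmonic(D)
    lo, hi = 1e-9, 1e3
    for _ in range(200):
        mid = (lo + hi) / 2
        if (1.0 - exp(-mid)) / mid > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def heavy_tail(D, x):
    """lambda_D(x) = (1 / H(D)) * sum_{i=1}^{D} x^i / i."""
    return sum(x**i / i for i in range(1, D + 1)) / harmonic(D)


def condition_holds(D, R, steps=1000):
    """Check delta * lambda_D(1 - rho_D(1 - x)) < x on a grid in (0, delta)
    for delta = (1 - R)(1 - 1/D)."""
    mu = solve_mu(D, R)
    delta = (1.0 - R) * (1.0 - 1.0 / D)
    grid = (delta * k / steps for k in range(1, steps))
    return all(delta * heavy_tail(D, 1.0 - exp(-mu * x)) < x for x in grid)


D, R = 20, 0.5
print(solve_mu(D, R), harmonic(D) / solve_mu(D, R), condition_holds(D, R))
```

A grid check of this kind is of course no proof; it merely illustrates the inequality derived above for one choice of $D$ and $R$.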

In the meantime, I have obtained yet another capacity-achieving sequence
whose left side is closely related to the power series expansion of
$(1 - x)^{\alpha}$, and which is right-regular, i.e., all nodes on the right
have the same degree [?]. More precisely, the new sequence is defined as
follows. For integers $a \ge 2$ and $n \ge 2$ let

$$\rho_a(x) := x^{a-1}, \qquad \lambda_{a,n}(x) := \frac{\alpha \sum_{k=1}^{n-1} \binom{\alpha}{k} (-1)^{k+1} x^k}{\alpha - n \binom{\alpha}{n} (-1)^{n+1}},$$

where $\alpha := 1/(a - 1)$. For the correct choice of the parameter $n$ and
other properties of these sequences we refer the reader to [?].

I would like to close this section with a few comments on the trade-off between
proximity to the channel capacity and the running time of the decoder. For the
Heavy-Tail/Poisson sequence the average degree of a message node was less than
$H(D)$, and it could tolerate up to a $(1 - R)(1 - 1/D)$ fraction of erasures.
Hence, to get within a factor $1 - \epsilon$ of the capacity $1 - R$, we needed
codes of average degree $O(\log(1/\epsilon))$. This is shown to be essentially
optimal in [?]. In other words, to get within a factor $1 - \epsilon$ of the
channel capacity, we need graphs of average degree
$\Omega(\log(1/\epsilon))$. The same relation also holds for the right-regular
sequences. Hence, these codes are essentially optimal for our simple decoders.




M. Amin Shokrollahi 



4 Codes on Other Channels 



In this section we will briefly describe some of the most recent developments
in the field of LDPC codes. Already in 1998, Luby, Mitzenmacher, Spielman, and
I [?] started to adapt the analysis of [7] to the situation of simple decoders
of Gallager for transmission over a binary symmetric channel. To my knowledge,
this was the first rigorous analysis of a probabilistic decoder for LDPC codes.
One common feature between these decoders and our simple erasure decoder
described above is the following: at each round of the iteration, one has to
keep track of only one real variable; in the case of the erasure decoder this
variable describes the probability of a message node being still erased. In the
case of Gallager's decoders it equals the probability of a message node being
in error.

The analysis of more powerful decoders like the belief-propagation decoder is
more complicated, as, at each round, one has to keep track of a density
function describing the distribution of various values at a message node. (For
a description of the belief-propagation algorithm, we refer the reader to [?].)
Nevertheless, Richardson and Urbanke managed to generalize the analysis of [?]
to this case as well. One of the main results of that paper is the derivation
of a recursion for the (common) density functions of the message nodes at each
iteration of the algorithm. The analysis was further simplified in [?], and
will be described below. First, we assume that the input alphabet is the set
$\{\pm 1\}$. At each round, the algorithm passes messages from message nodes to
check nodes, and then from check nodes to message nodes. We assume that at the
message nodes the messages are represented as log-likelihood ratios

$$\log \frac{p(y \mid x = 1)}{p(y \mid x = -1)},$$

where $y$ represents all the observations conveyed to the message node at that
time. Now let $f_\ell$ denote the probability density function at the message
nodes at the $\ell$th round of the algorithm. $f_0$ is then the density
function of the error which the message bits are originally exposed to. It is
also denoted by $P_0$. These density functions are defined on the set
$\mathbb{R} \cup \{\pm\infty\}$. It turns out that they satisfy a symmetry
condition [13]: $f(-x) = f(x)e^{-x}$. As a result, the value of any of these
density functions is determined from the set of its values on the set
$\mathbb{R}_{\ge 0} \cup \{\infty\}$. The restriction of a function $f$ to this
set is denoted by $f^{\ge 0}$. (The technical difficulty of defining a function
at $\infty$ could be solved by using distributions instead of functions, but we
will not further discuss it here.)
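
The symmetry condition can be illustrated on the binary symmetric channel, where the initial density consists of two point masses. The sketch below assumes a crossover probability `p`, a channel output `y` in $\{\pm 1\}$, and that the symbol $x = 1$ was sent; these conventions and the function names are assumptions chosen for illustration.

```python
from math import exp, log


def bsc_llr(y: int, p: float) -> float:
    """Log-likelihood ratio log p(y | x = 1) / p(y | x = -1) on a binary
    symmetric channel with crossover probability p and output y in {+1, -1}."""
    return y * log((1.0 - p) / p)


def bsc_initial_density(p: float) -> dict:
    """Point masses of the initial log-likelihood density when x = 1 is sent:
    the output is correct with probability 1 - p and flipped with probability p."""
    return {bsc_llr(1, p): 1.0 - p, bsc_llr(-1, p): p}


p = 0.1
density = bsc_initial_density(p)
L = bsc_llr(1, p)
# Symmetry condition f(-x) = f(x) * exp(-x), evaluated at the point x = L.
print(density[-L], density[L] * exp(-L))
```

Both printed numbers agree with `p` up to rounding, so the two point masses satisfy the symmetry condition at the only points where the density is supported.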

For a function $f$ defined on $\mathbb{R}_{\ge 0} \cup \{\infty\}$ we define a hyperbolic change of 
measure $\gamma$ via 

$$\gamma(f)(x) := f(\ln\coth(x/2))\,\mathrm{csch}(x).$$

If $f$ is a function satisfying the symmetry condition, then $\gamma(f^{\ge 0})$ defines a 
function on $\mathbb{R}_{\ge 0} \cup \{\infty\}$ which can be uniquely extended to a function $F$ on $\mathbb{R} \cup \{\pm\infty\}$. 
The transformation mapping $f$ to $F$ is denoted by $\Gamma$. It is a bijective mapping 
from the set of density functions on $\mathbb{R} \cup \{\pm\infty\}$ satisfying the symmetry 
condition into itself. Let $f_\ell$ denote the common density function of the 
messages passed from message nodes to check nodes at round $\ell$ of the algorithm. 
$f_0$ then denotes the density of the original error, and is also denoted by $P_0$. 
Suppose that the graph has a degree distribution given by $\lambda(x)$ and $\rho(x)$. Then 
we have the following: 

$$f_\ell = P_0 \otimes \lambda(\Gamma^{-1}(\rho(\Gamma(f_{\ell-1})))), \quad \ell \ge 1. \qquad (3)$$

Here, $\otimes$ denotes the convolution, and for a function $f$, $\lambda(f)$ denotes the function 
$\sum_i \lambda_i f^{\otimes(i-1)}$. In the case of the erasure channel, the corresponding density 
functions are two-point mass functions, with a mass $p_\ell$ at zero and a mass $(1 - p_\ell)$ 
at infinity. In this case, the iteration translates to US! 

$$p_\ell = \delta\lambda(1 - \rho(1 - p_{\ell-1})),$$

where $\delta$ is the original fraction of erasures. This is exactly the same as in |IJ). 
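
The erasure-channel recursion can be iterated numerically. The sketch below is only an illustration under stated assumptions: the degree distribution is taken to be that of a (3,6)-regular graph, so lambda(x) = x^2 and rho(x) = x^5 (a choice made for this example, not one fixed by the text), and the function name, starting value p_0 = delta, and stopping tolerance are hypothetical.

```python
def density_evolution_bec(delta, lam, rho, rounds=5000, tol=1e-12):
    """Iterate p_l = delta * lam(1 - rho(1 - p_{l-1})), starting from p_0 = delta.

    Returns the (approximate) limit of the erasure fraction p_l.
    """
    p = delta
    for _ in range(rounds):
        p_next = delta * lam(1.0 - rho(1.0 - p))
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p

# Assumed example: (3,6)-regular graph, edge-perspective degree polynomials.
lam = lambda x: x ** 2   # every message node has degree 3
rho = lambda x: x ** 5   # every check node has degree 6

p_low = density_evolution_bec(0.40, lam, rho)    # erasure fraction dies out
p_high = density_evolution_bec(0.45, lam, rho)   # stuck at a positive fixed point
```

For this assumed example, the iteration drives the erasure fraction to zero when delta = 0.40, while for delta = 0.45 it stalls at a positive fixed point, so decoding does not complete.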

5 Conclusion and Open Problems 

There are still a large number of important open questions about LDPC codes. 
Among them I would like to single out two topics which I call the “asymp- 
totic theory” and “short codes”. As the name suggests, the asymptotic theory 
deals with the asymptotic performance of various decoding algorithms for LDPC 
codes. As discussed above, a lot of progress has been made in the asymptotic 
analysis of belief propagation and other algorithms. One of the main important 
open questions is, for each given algorithm, the design of the degree structure 
of the underlying graphs such that the corresponding codes perform optimally 
with respect to the given decoding algorithm. This is particularly important for 
the case of belief-propagation. Here we go even further, and ask about 
capacity-achieving sequences of degree distributions. In other words, given a channel with 
capacity $C$, we want for any given $\epsilon$ explicit degree distributions $\lambda_\epsilon(x)$ and $\rho_\epsilon(x)$ 
such that, asymptotically, LDPC codes obtained from sampling graphs with 
these distributions perform at rates that are within $\epsilon$ of $C$ when decoded using 
belief-propagation. The only case for which we know such sequences is that of 
the erasure channel [10,16]. We conjecture that such sequences exist for other 
channels like the AWGN channel, or the BSC, as well. 

The topic of “short codes” deals with the construction and analysis of good 
“short” codes. Here we only mention the question of a rigorous analysis of codes 
of finite length (rather than the asymptotic analysis discussed in the last para- 
graph). The problem that arises here is that for most of the message nodes, 
the neighborhood is a tree only for very small depths. In other words, the de- 
coder works on graphs that have cycles, and the analysis described above is not 
adequate. 

References 

1. N. Alon and M. Luby. A linear time erasure-resilient code with nearly optimal 
recovery. IEEE Trans. Inform. Theory, 42:1732-1736, 1996. 






2. S. Arora and S. Safra. Probabilistic checking of proofs: a new characterization of 
NP. J. ACM, 45:70-122, 1998. 

3. E.R. Berlekamp, R.J. McEliece, and H.C.A. van Tilborg. On the inherent in- 
tractability of certain coding problems. IEEE Trans. Inform. Theory, 24:384-386, 
1978. 

4. C. Berrou, A. Glavieux, and P. Thitimajshima. Near Shannon limit 
error-correcting coding and decoding. In Proceedings of ICC'93, pages 1064-1070, 1993. 

5. P. Elias. Coding for two noisy channels. In Information Theory, Third London 
Symposium, pages 61-76, 1955. 

6. R. G. Gallager. Low Density Parity-Check Codes. MIT Press, Cambridge, MA, 
1963. 

7. M. Luby, M. Mitzenmacher, and M.A. Shokrollahi. Analysis of random processes 
via and-or tree evaluation. In Proceedings of the 9th Annual ACM-SIAM 
Symposium on Discrete Algorithms, pages 364-373, 1998. 

8. M. Luby, M. Mitzenmacher, M.A. Shokrollahi, and D. Spielman. Analysis of low 
density codes and improved designs using irregular graphs. In Proceedings of the 
30th Annual ACM Symposium on Theory of Computing, pages 249-258, 1998. 

9. M. Luby, M. Mitzenmacher, M.A. Shokrollahi, and D. Spielman. Improved low- 
density parity-check codes using irregular graphs and belief propagation. In Pro- 
ceedings 1998 IEEE International Symposium on Information Theory, page 117, 
1998. 

10. M. Luby, M. Mitzenmacher, M.A. Shokrollahi, D. Spielman, and V. Stemann. 
Practical loss-resilient codes. In Proceedings of the 29th annual ACM Symposium 
on Theory of Computing, pages 150-159, 1997. 

11. D.J.C. MacKay. Good error-correcting codes based on very sparse matrices. IEEE 
Trans. Inform. Theory, 45:399-431, 1999. 

12. G. A. Margulis. Explicit constructions of graphs without short cycles and low 
density codes. Combinatorica, 2:71-78, 1982. 

13. T. Richardson, M.A. Shokrollahi, and R. Urbanke. Design of provably good low- 
density parity check codes. IEEE Trans. Inform. Theory (submitted), 1999. 

14. T. Richardson and R. Urbanke. The capacity of low-density parity check codes 
under message-passing decoding. IEEE Trans. Inform. Theory (submitted), 1998. 

15. C. E. Shannon. A mathematical theory of communication. Bell System Tech. J., 
27:379-423, 623-656, 1948. 

16. M.A. Shokrollahi. New sequences of linear time erasure codes approaching the 
channel capacity. To appear in the Proceedings of AAECC-13, 1999. 

17. M. Sipser and D. Spielman. Expander codes. IEEE Trans. Inform. Theory, 
42:1710-1722, 1996. 

18. D. Spielman. Linear-time encodable and decodable error-correcting codes. IEEE 
Trans. Inform. Theory, 42:1723-1731, 1996. 

19. M. R. Tanner. A recursive approach to low complexity codes. IEEE Trans. In- 
form. Theory, 27:533-547, 1981. 

20. V. V. Zyablov. An estimate of the complexity of constructing binary linear cascade 
codes. Probl. Inform. Transm., 7:3-10, 1971. 

21. V. V. Zyablov and M. S. Pinsker. Estimation of error-correction complexity of 
Gallager low-density codes. Probl. Inform. Transm., 11:18-28, 1976. 




A Classification of Symbolic Transition Systems* 



Thomas A. Henzinger and Rupak Majumdar 



Department of Electrical Engineering and Computer Sciences 
University of California at Berkeley, CA 94720-1770, USA 
{tah,rupak}@eecs.berkeley.edu 



Abstract. We define five increasingly comprehensive classes of 
infinite-state systems, called STS1-5, whose state spaces have finitary structure. 
For four of these classes, we provide examples from hybrid systems. 

STS1 These are the systems with finite bisimilarity quotients. They can 
be analyzed symbolically by (1) iterating the predecessor and boolean 
operations starting from a finite set of observable state sets, and (2) 
terminating when no new state sets are generated. This enables model checking 
of the $\mu$-calculus. 

STS2 These are the systems with finite similarity quotients. They can be 
analyzed symbolically by iterating the predecessor and positive boolean 
operations. This enables model checking of the existential and universal 
fragments of the $\mu$-calculus. 

STS3 These are the systems with finite trace-equivalence quotients. They 
can be analyzed symbolically by iterating the predecessor operation and 
a restricted form of positive boolean operations (intersection is restricted 
to intersection with observables). This enables model checking of linear 
temporal logic. 

STS4 These are the systems with finite distance-equivalence quotients 
(two states are equivalent if for every distance d, the same observables 
can be reached in d transitions). The systems in this class can be 
analyzed symbolically by iterating the predecessor operation and 
terminating when no new state sets are generated. This enables model checking of 
the existential conjunction-free and universal disjunction-free fragments 
of the $\mu$-calculus. 

STS5 These are the systems with finite bounded-reachability quotients 
(two states are equivalent if for every distance d, the same observables 
can be reached in d or fewer transitions). The systems in this class can be 
analyzed symbolically by iterating the predecessor operation and 
terminating when no new states are encountered. This enables model checking 
of reachability properties. 

* This research was supported in part by the DARPA (NASA) grant NAG2-1214, the 
DARPA (Wright-Patterson AFB) grant F33615-C-98-3614, the MARCO grant 98- 
DT-660, the ARO MURI grant DAAH-04-96-1-0341, and the NSF CAREER award 
CCR-9501708. 
0 Introduction 

To explore the state space of an infinite-state transition system, it is often con- 
venient to compute on a data type called “region,” whose members represent 
(possibly infinite) sets of states. Regions might be implemented, for example, as 
constraints on the integers or reals. We say that a transition system is “sym- 
bolic” if it comes equipped with an algebra of regions which permits the effective 
computation of certain operations on regions. For model checking, we are par- 
ticularly interested in boolean operations on regions as well as the predecessor 
operation, which, given a target region, computes the region of all states with 
successors in the target region. While a region algebra supports individual op- 
erations on regions, the iteration of these operations may generate an infinite 
number of distinct regions. In this paper, we study restricted classes of symbolic 
transition systems for which certain forms of iteration, if terminated after a finite 
number of operations, still yield sufficient information for checking interesting, 
unbounded temporal properties of the system. 



0.1 Symbolic Transition Systems 

Definition: Symbolic transition system A symbolic transition system 
$S = (Q, \delta, R, \ulcorner\cdot\urcorner, P)$ consists of a (possibly infinite) set $Q$ of states, a (possibly 
nondeterministic) transition function $\delta : Q \to 2^Q$ which maps each state to a set 
of successor states, a (possibly infinite) set $R$ of regions, an extension function 
$\ulcorner\cdot\urcorner : R \to 2^Q$ which maps each region to a set of contained states, and a finite 
set $P \subseteq R$ of observables, such that the following six conditions are satisfied: 

1. The set $P$ of observables covers the state space $Q$; that is, 
$\bigcup \{\ulcorner p \urcorner \mid p \in P\} = Q$. 

2. For each region $\sigma \in R$, there is a region $\mathit{Pre}(\sigma) \in R$ such that 

$$\ulcorner \mathit{Pre}(\sigma) \urcorner = \{s \in Q \mid (\exists t \in \delta(s) : t \in \ulcorner \sigma \urcorner)\};$$

furthermore, the function $\mathit{Pre} : R \to R$ is computable. 

3. For each pair $\sigma, \tau \in R$ of regions, there is a region $\mathit{And}(\sigma, \tau) \in R$ such that 
$\ulcorner \mathit{And}(\sigma, \tau) \urcorner = \ulcorner \sigma \urcorner \cap \ulcorner \tau \urcorner$; furthermore, the function $\mathit{And} : R \times R \to R$ is 
computable. 

4. For each pair $\sigma, \tau \in R$ of regions, there is a region $\mathit{Diff}(\sigma, \tau) \in R$ such that 
$\ulcorner \mathit{Diff}(\sigma, \tau) \urcorner = \ulcorner \sigma \urcorner \setminus \ulcorner \tau \urcorner$; furthermore, the function $\mathit{Diff} : R \times R \to R$ is 
computable. 

5. All emptiness questions about regions can be decided; that is, there is a 
computable function $\mathit{Empty} : R \to \mathbb{B}$ such that $\mathit{Empty}(\sigma)$ iff $\ulcorner \sigma \urcorner = \emptyset$. 

6. All membership questions about regions can be decided; that is, there is 
a computable function $\mathit{Member} : Q \times R \to \mathbb{B}$ such that $\mathit{Member}(s, \sigma)$ iff 
$s \in \ulcorner \sigma \urcorner$. 

The tuple $\mathcal{R}_S = (P, \mathit{Pre}, \mathit{And}, \mathit{Diff}, \mathit{Empty})$ is called the region algebra of $S$. □ 
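
To make the six conditions concrete, here is a minimal sketch of a region algebra for a small finite system, in which a region is simply an explicit finite set of states, so every operation is trivially computable. This representation is an assumption of the example: the systems of interest in this paper are infinite-state, where regions would be symbolic constraints. The names `make_region_algebra`, `even`, and `odd` are hypothetical.

```python
from typing import Dict, FrozenSet, Set

Region = FrozenSet[int]  # a region is represented directly by its extension

def make_region_algebra(delta: Dict[int, Set[int]], observables: Set[Region]):
    """Build Pre, And, Diff, Empty, Member for an explicit finite system."""
    states = frozenset(delta)
    # Condition 1: the observables must cover the state space.
    assert frozenset().union(*observables) == states

    def pre(sigma: Region) -> Region:
        # States with at least one successor in sigma (existential view).
        return frozenset(s for s in states if delta[s] & sigma)

    return {
        "P": observables,
        "pre": pre,
        "and": lambda sigma, tau: sigma & tau,
        "diff": lambda sigma, tau: sigma - tau,
        "empty": lambda sigma: len(sigma) == 0,
        "member": lambda s, sigma: s in sigma,
    }

# A four-state cycle 0 -> 1 -> 2 -> 3 -> 0 with observables "even" and "odd".
delta = {0: {1}, 1: {2}, 2: {3}, 3: {0}}
even, odd = frozenset({0, 2}), frozenset({1, 3})
algebra = make_region_algebra(delta, {even, odd})
pre_of_even = algebra["pre"](even)
```

In this example, `pre_of_even` is the set of odd states, since exactly the odd states have an even successor.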
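Remark: Duality We take an existential view of symbolic transition systems. 
The dual, universal view requires (1) $\bigcap \{\ulcorner p \urcorner \mid p \in P\} = \emptyset$, (2-4) closure of $R$ 
under computable functions $\mathit{Pre}$, $\mathit{And}$, and $\mathit{Diff}$ such that 

$$\ulcorner \mathit{Pre}(\sigma) \urcorner = \{s \in Q \mid (\forall t \in \delta(s) : t \in \ulcorner \sigma \urcorner)\},$$

$\ulcorner \mathit{And}(\sigma, \tau) \urcorner = \ulcorner \sigma \urcorner \cup \ulcorner \tau \urcorner$, and $\ulcorner \mathit{Diff}(\sigma, \tau) \urcorner = Q \setminus \ulcorner \mathit{Diff}(\tau, \sigma) \urcorner$, and (5) a 
computable function $\mathit{Empty}$ for deciding all universality questions about regions 
(that is, $\mathit{Empty}(\sigma)$ iff $\ulcorner \sigma \urcorner = Q$). All results of this paper have an alternative, 
dual formulation. □ 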

0.2 Example: Polyhedral Hybrid Automata 

A polyhedral hybrid automaton $H$ of dimension $m$, for a positive integer $m$, 
consists of the following components [AHH96]: 

Continuous variables A set $X = \{x_1, \ldots, x_m\}$ of real-valued variables. We 
write $\dot{X}$ for the set $\{\dot{x}_1, \ldots, \dot{x}_m\}$ of dotted variables (which represent first 
derivatives during continuous change), and we write $X'$ for the set $\{x'_1, \ldots, x'_m\}$ 
of primed variables (which represent values at the conclusion of discrete 
change). A linear constraint over $X$ is an expression of the form 
$k_0 \sim k_1 x_1 + \cdots + k_m x_m$, where $\sim\, \in \{<, \le, =, \ge, >\}$ and $k_0, \ldots, k_m$ are integer constants. 
A linear predicate over $X$ is a boolean combination of linear constraints 
over $X$. Let $L^m$ be the set of linear predicates over $X$. 

Discrete locations A finite directed multigraph $(V, E)$. The vertices in $V$ are 
called locations; the edges in $E$ are called jumps. 

Invariant and flow conditions Two vertex-labeling functions $\mathit{inv}$ and $\mathit{flow}$. 
For each location $v \in V$, the invariant condition $\mathit{inv}(v)$ is a conjunction of 
linear constraints over $X$, and the flow condition $\mathit{flow}(v)$ is a conjunction of 
linear constraints over $\dot{X}$. While the automaton control resides in location $v$, 
the variables may evolve according to $\mathit{flow}(v)$ as long as $\mathit{inv}(v)$ remains true. 

Update conditions An edge-labeling function $\mathit{update}$. For each jump $e \in E$, 
the update condition $\mathit{update}(e)$ is a conjunction of linear constraints over 
$X \cup X'$. The predicate $\mathit{update}(e)$ relates the possible values of the variables 
at the beginning of the jump (represented by $X$) and at the conclusion of 
the jump (represented by $X'$). 

The polyhedral hybrid automaton $H$ is a rectangular automaton |HkPV!^A| if 

- all linear constraints that occur in invariant conditions of $H$ have the 
form $x \sim k$, for $x \in X$ and $k \in \mathbb{Z}$; 

- all linear constraints that occur in flow conditions of $H$ have the form 
$\dot{x} \sim k$, for $x \in X$ and $k \in \mathbb{Z}$; 

- all linear constraints that occur in jump conditions of $H$ have the form 
$x \sim k$ or $x' = x$ or $x' \sim k$, for $x \in X$ and $k \in \mathbb{Z}$; 

- if $e$ is a jump from location $v$ to location $v'$, and $\mathit{update}(e)$ contains 
the conjunct $x' = x$, then both $\mathit{flow}(v)$ and $\mathit{flow}(v')$ contain the same 
constraints on $\dot{x}$. 
The rectangular automaton $H$ is a singular automaton if each flow condition of 
$H$ has the form $\dot{x}_1 = k_1 \wedge \ldots \wedge \dot{x}_m = k_m$. The singular automaton $H$ is a timed 
automaton if each flow condition of $H$ has the form $\dot{x}_1 = 1 \wedge \ldots \wedge \dot{x}_m = 1$. 

The polyhedral hybrid automaton $H$ defines the symbolic transition system 
$S_H = (Q_H, \delta_H, R_H, \ulcorner\cdot\urcorner_H, P_H)$ with the following components: 

- $Q_H = V \times \mathbb{R}^m$; that is, every state $(v, \mathbf{x})$ consists of a location $v$ (the discrete 
component of the state) and values $\mathbf{x}$ for the variables in $X$ (the continuous 
component). 

- $(v', \mathbf{x}') \in \delta_H(v, \mathbf{x})$ if either (1) there is a jump $e \in E$ from $v$ to $v'$ such 
that the closed predicate $\mathit{update}(e)[X, X' := \mathbf{x}, \mathbf{x}']$ is true, or (2) $v' = v$ and 
there is a real $\Delta > 0$ and a differentiable function $f : [0, \Delta] \to \mathbb{R}^m$ with first 
derivative $\dot{f}$ such that $f(0) = \mathbf{x}$ and $f(\Delta) = \mathbf{x}'$, and for all reals $\varepsilon \in (0, \Delta)$, 
the closed predicates $\mathit{inv}(v)[X := f(\varepsilon)]$ and $\mathit{flow}(v)[\dot{X} := \dot{f}(\varepsilon)]$ are true. In 
case (2), the function $f$ is called a flow function. 

- $R_H = V \times L^m$; that is, every region $(v, \phi)$ consists of a location $v$ (the discrete 
component of the region) and a linear predicate $\phi$ over $X$ (the continuous 
component). 

- $\ulcorner (v, \phi) \urcorner_H = \{(v, \mathbf{x}) \mid \mathbf{x} \in \mathbb{R}^m$ and $\phi[X := \mathbf{x}]$ is true$\}$; that is, the extension 
function maps the continuous component $\phi$ of a region to the values for the 
variables in $X$ which satisfy the predicate $\phi$. Consequently, the extension of 
every region consists of a location and a polyhedral subset of $\mathbb{R}^m$. 

- $P_H = V \times \{\mathit{true}\}$; that is, only the discrete component of a state is observable. 

It requires some work to see that $S_H$ is indeed a symbolic transition system. First, 
notice that the linear predicates over $X$ are closed under all boolean operations, 
and that satisfiability is decidable for the linear predicates. Second, the $\mathit{Pre}$ 
operator is computable on $R_H$, because all flow functions can be replaced by 
straight lines [AHH96]. 

0.3 Background Definitions 

The symbolic transition systems are a special case of transition systems. A 
transition system $S = (Q, \delta, \cdot, \ulcorner\cdot\urcorner, P)$ has the same components as a symbolic 
transition system, except that no regions are specified and the extension function is 
defined only for the observables (that is, $\ulcorner\cdot\urcorner : P \to 2^Q$). 

State equivalences A state equivalence $\cong$ is a family of relations which contains 
for each transition system $S$ an equivalence relation $\cong^S$ on the states of $S$. 
The $\cong$ equivalence problem for a class $C$ of transition systems asks, given two 
states $s$ and $t$ of a transition system $S$ from the class $C$, whether $s \cong^S t$. The 
state equivalence $\cong_a$ is as coarse as the state equivalence $\cong_b$ if $s \cong_b^S t$ implies 
$s \cong_a^S t$ for all transition systems $S$. The equivalence $\cong_a$ is coarser than $\cong_b$ 
if $\cong_a$ is as coarse as $\cong_b$, but $\cong_b$ is not as coarse as $\cong_a$. Given a transition 
system $S = (Q, \delta, \cdot, \ulcorner\cdot\urcorner, P)$ and a state equivalence $\cong$, the quotient system is the 
transition system $S/_{\cong} = (Q/_{\cong}, \delta/_{\cong}, \cdot, \ulcorner\cdot\urcorner/_{\cong}, P)$ with the following components: 

- the states in $S/_{\cong}$ are the equivalence classes of $\cong^S$; 

- $\tau \in \delta/_{\cong}(\sigma)$ if there is a state $s \in \sigma$ and a state $t \in \tau$ such that $t \in \delta(s)$; 

- $\sigma \in \ulcorner p \urcorner/_{\cong}$ if there is a state $s \in \sigma$ such that $s \in \ulcorner p \urcorner$. 

The quotient construction is of particular interest to us when it transforms an 
infinite-state system $S$ into a finite-state system $S/_{\cong}$. 

State logics A state logic $L$ is a logic whose formulas are interpreted over the 
states of transition systems; that is, for every $L$-formula $\varphi$ and every transition 
system $S$, there is a set $[\![\varphi]\!]_S$ of states of $S$ which satisfy $\varphi$. The $L$ 
model-checking problem for a class $C$ of transition systems asks, given an $L$-formula $\varphi$ 
and a state $s$ of a transition system $S$ from the class $C$, whether $s \in [\![\varphi]\!]_S$. Two 
formulas $\varphi$ and $\psi$ of state logics are equivalent if $[\![\varphi]\!]_S = [\![\psi]\!]_S$ for all transition 
systems $S$. The state logic $L_a$ is as expressive as the state logic $L_b$ if for every 
$L_b$-formula $\varphi$, there is an $L_a$-formula $\psi$ which is equivalent to $\varphi$. The logic $L_a$ is 
more expressive than $L_b$ if $L_a$ is as expressive as $L_b$, but $L_b$ is not as expressive 
as $L_a$. Every state logic $L$ induces a state equivalence, denoted $\cong_L$: for all states 
$s$ and $t$ of a transition system $S$, define $s \cong_L^S t$ if for all $L$-formulas $\varphi$, we have 
$s \in [\![\varphi]\!]_S$ iff $t \in [\![\varphi]\!]_S$. The state logic $L$ admits abstraction if for every $L$-formula 
$\varphi$ and every transition system $S$, we have $[\![\varphi]\!]_S = \bigcup \{\sigma \mid \sigma \in [\![\varphi]\!]_{S/_{\cong_L}}\}$; that is, 
a state $s$ of $S$ satisfies an $L$-formula $\varphi$ iff the $\cong_L$ equivalence class of $s$ satisfies 
$\varphi$ in the quotient system. Consequently, if $L$ admits abstraction, then every 
$L$ model-checking question on a transition system $S$ can be reduced to an $L$ 
model-checking question on the induced quotient system $S/_{\cong_L}$. Below, we shall 
repeatedly prove the $L$ model-checking problem for a class $C$ to be decidable by 
observing that for every transition system $S$ from $C$, the quotient system $S/_{\cong_L}$ 
has finitely many states and can be constructed effectively. 

Symbolic semi-algorithms A symbolic semi-algorithm takes as input the 
region algebra $\mathcal{R}_S = (P, \mathit{Pre}, \mathit{And}, \mathit{Diff}, \mathit{Empty})$ of a symbolic transition system 
$S = (Q, \delta, R, \ulcorner\cdot\urcorner, P)$, and generates regions in $R$ using the operations $P$, $\mathit{Pre}$, 
$\mathit{And}$, $\mathit{Diff}$, and $\mathit{Empty}$. Depending on the input $S$, a symbolic semi-algorithm 
on $S$ may or may not terminate. 

0.4 Preview 

In sections 1-5 of this paper, we shall define five increasingly comprehensive 
classes of symbolic transition systems. In each case $i \in \{1, \ldots, 5\}$, we will proceed 
in four steps: 

1 Definition: Finite characterization We give a state equivalence $\cong_i$ and 
define the class STS(i) to contain precisely the symbolic transition systems $S$ 
for which the equivalence relation $\cong_i^S$ has finite index (i.e., there are finitely 
many $\cong_i^S$ equivalence classes). Each state equivalence $\cong_i$ is coarser than its 
predecessor $\cong_{i-1}$, which implies that STS(i-1) $\subset$ STS(i) for $i \in \{2, \ldots, 5\}$. 

2 Algorithmics: Symbolic state-space exploration We give a symbolic 
semi-algorithm that terminates precisely on the symbolic transition systems in 
the class STS(i). This provides an operational characterization of the class STS(i) 
which is equivalent to the denotational definition of STS(i). Termination of the 
semi-algorithm is proved by observing that if given the region algebra of a 
symbolic transition system $S$ as input, then the extensions of all regions generated 
by the semi-algorithm are $\cong_i^S$ blocks (i.e., unions of $\cong_i^S$ equivalence classes). 
If $S$ is in the class STS(i), then there are only finitely many $\cong_i^S$ blocks, and 
the semi-algorithm terminates upon having constructed a representation of the 
quotient system $S/_{\cong_i}$. The semi-algorithm can therefore be used to decide all $\cong_i$ 
equivalence questions for the class STS(i). 

3 Verification: Decidable properties We give a state logic $L_i$ which admits 
abstraction and induces the state equivalence $\cong_i$. Since $\cong_i$ quotients can be 
constructed effectively, it follows that the $L_i$ model-checking problem for the 
class STS(i) is decidable. However, model-checking algorithms which rely on 
the explicit construction of quotient systems are usually impractical. Hence, we 
also give a symbolic semi-algorithm that terminates on the symbolic transition 
systems in the class STS(i) and directly decides all $L_i$ model-checking questions 
for this class. 

4 Example: Hybrid systems The interesting members of the class STS(i) are 
those with infinitely many states. In four out of the five cases, following EiiMI, 
we provide certain kinds of polyhedral hybrid automata as examples. 



1 Class-1 Symbolic Transition Systems 

Class-1 systems are characterized by finite bisimilarity quotients. The region 
algebra of a class-1 system has a finite subalgebra that contains the observables 
and is closed under $\mathit{Pre}$, $\mathit{And}$, and $\mathit{Diff}$ operations. This enables the model 
checking of all $\mu$-calculus properties. Infinite-state examples of class-1 systems 
are provided by the singular hybrid automata. 



1.1 Finite Characterization: Bisimilarity 

Definition: Bisimilarity Let $S = (Q, \delta, \cdot, \ulcorner\cdot\urcorner, P)$ be a transition system. A 
binary relation $\preceq$ on the state space $Q$ is a simulation on $S$ if $s \preceq t$ implies the 
following two conditions: 

1. For each observable $p \in P$, we have $s \in \ulcorner p \urcorner$ iff $t \in \ulcorner p \urcorner$. 

2. For each state $s' \in \delta(s)$, there is a state $t' \in \delta(t)$ such that $s' \preceq t'$. 

Two states $s, t \in Q$ are bisimilar, denoted $s \cong_1^S t$, if there is a symmetric 
simulation $\preceq$ on $S$ such that $s \preceq t$. The state equivalence $\cong_1$ is called bisimilarity. □ 

Definition: Class STS1 A symbolic transition system $S$ belongs to the class 
STS1 if the bisimilarity relation $\cong_1^S$ has finite index. □ 
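Symbolic semi-algorithm Closure1 

Input: a region algebra $\mathcal{R} = (P, \mathit{Pre}, \mathit{And}, \mathit{Diff}, \mathit{Empty})$. 

$T_0 := P$; 
for $i = 0, 1, 2, \ldots$ do 
    $T_{i+1} := T_i$ 
        $\cup\ \{\mathit{Pre}(\sigma) \mid \sigma \in T_i\}$ 
        $\cup\ \{\mathit{And}(\sigma, \tau) \mid \sigma, \tau \in T_i\}$ 
        $\cup\ \{\mathit{Diff}(\sigma, \tau) \mid \sigma, \tau \in T_i\}$ 
until $\ulcorner T_{i+1} \urcorner \subseteq \ulcorner T_i \urcorner$. 

The termination test $\ulcorner T_{i+1} \urcorner \subseteq \ulcorner T_i \urcorner$, which is shorthand for 
$\{\ulcorner \sigma \urcorner \mid \sigma \in T_{i+1}\} \subseteq \{\ulcorner \sigma \urcorner \mid \sigma \in T_i\}$, is decided as follows: for each region 
$\sigma \in T_{i+1}$ check that there is a region $\tau \in T_i$ such that both 
$\mathit{Empty}(\mathit{Diff}(\sigma, \tau))$ and $\mathit{Empty}(\mathit{Diff}(\tau, \sigma))$. 



Fig. 1. Partition refinement 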



1.2 Symbolic State-Space Exploration: Partition Refinement 

The bisimilarity relation of a finite-state system can be computed by partition 
refinement [KS90]. The symbolic semi-algorithm Closure1 of Figure 1 applies 
this method to infinite-state systems IfiFHhbltFCTl . Suppose that the input 
given to Closure1 is the region algebra of a symbolic transition system 
$S = (Q, \delta, R, \ulcorner\cdot\urcorner, P)$. Then each $T_i$, for $i \ge 0$, is a finite set of regions; that is, $T_i \subseteq R$. 
By induction it is easy to check that for all $i \ge 0$, the extension of every region 
in $T_i$ is a $\cong_1^S$ block. Thus, if $\cong_1^S$ has finite index, then Closure1 terminates. 
Conversely, suppose that Closure1 terminates with $\ulcorner T_{i+1} \urcorner \subseteq \ulcorner T_i \urcorner$. From the 
definition of bisimilarity it follows that if for each region $\sigma \in T_i$, we have $s \in \ulcorner \sigma \urcorner$ 
iff $t \in \ulcorner \sigma \urcorner$, then $s \cong_1^S t$. This implies that $\cong_1^S$ has finite index. 
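
The closure computation can be sketched for an explicit finite system, where it always terminates. This is an illustration only, under the assumption that a region is represented by its extension (a finite set of states), so the termination test reduces to plain set inclusion. The names `closure1` and `bisimilarity_classes` are hypothetical. The bisimilarity class of a state is then recovered as the intersection of all generated regions that contain it.

```python
from itertools import product

def closure1(states, delta, observables):
    """Close the observables under Pre, And, and Diff (regions = frozensets)."""
    def pre(sigma):
        return frozenset(s for s in states if delta[s] & sigma)

    T = set(observables)
    while True:
        T_next = set(T)
        T_next |= {pre(sigma) for sigma in T}
        T_next |= {sigma & tau for sigma, tau in product(T, T)}
        T_next |= {sigma - tau for sigma, tau in product(T, T)}
        if T_next <= T:          # no new extension was generated
            return T
        T = T_next

def bisimilarity_classes(states, regions):
    """Each class is the intersection of all regions containing a given state."""
    classes = set()
    for s in states:
        block = frozenset(states)
        for sigma in regions:
            if s in sigma:
                block &= sigma
        classes.add(block)
    return classes

# A chain 0 -> 1 -> 2 -> 3 (with a self-loop on 3); observables {0,1,2} and {3}.
chain_states = {0, 1, 2, 3}
chain_delta = {0: {1}, 1: {2}, 2: {3}, 3: {3}}
chain_obs = {frozenset({0, 1, 2}), frozenset({3})}
chain_classes = bisimilarity_classes(
    chain_states, closure1(chain_states, chain_delta, chain_obs))

# A two-state cycle with a single observable: both states are bisimilar.
cycle_classes = bisimilarity_classes(
    {0, 1}, closure1({0, 1}, {0: {1}, 1: {0}}, {frozenset({0, 1})}))
```

In the chain, every state has a different distance to the second observable, so all four classes are singletons; in the cycle, the two states fall into one class.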

Theorem 1A For all symbolic transition systems $S$, the symbolic semi-algorithm 
Closure1 terminates on the region algebra $\mathcal{R}_S$ iff $S$ belongs to the class STS1. 

Corollary 1A The $\cong_1$ (bisimilarity) equivalence problem is decidable for the 
class STS1 of symbolic transition systems. 

1.3 Decidable Properties: Branching Time 

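Definition: $\mu$-calculus The formulas of the $\mu$-calculus are generated by the 
grammar 

$$\varphi ::= p \mid \bar{p} \mid x \mid \varphi \vee \varphi \mid \varphi \wedge \varphi \mid \exists\bigcirc\varphi \mid \forall\bigcirc\varphi \mid (\mu x : \varphi) \mid (\nu x : \varphi),$$

for constants $p$ from some set $\Pi$, and variables $x$ from some set $X$. Let 
$S = (Q, \delta, \cdot, \ulcorner\cdot\urcorner, P)$ be a transition system whose observables include all constants; 
that is, $\Pi \subseteq P$. Let $\mathcal{E} : X \to 2^Q$ be a mapping from the variables to sets of 
states. We write $\mathcal{E}[x \mapsto \rho]$ for the mapping that agrees with $\mathcal{E}$ on all variables, 
except that $x \in X$ is mapped to $\rho \subseteq Q$. Given $S$ and $\mathcal{E}$, every formula $\varphi$ of the 
$\mu$-calculus defines a set $[\![\varphi]\!]_{S,\mathcal{E}} \subseteq Q$ of states: 

$$\begin{aligned}
[\![p]\!]_{S,\mathcal{E}} &= \ulcorner p \urcorner; \\
[\![\bar{p}]\!]_{S,\mathcal{E}} &= Q \setminus \ulcorner p \urcorner; \\
[\![x]\!]_{S,\mathcal{E}} &= \mathcal{E}(x); \\
[\![\varphi_1 \vee \varphi_2]\!]_{S,\mathcal{E}} &= [\![\varphi_1]\!]_{S,\mathcal{E}} \cup [\![\varphi_2]\!]_{S,\mathcal{E}}; \\
[\![\varphi_1 \wedge \varphi_2]\!]_{S,\mathcal{E}} &= [\![\varphi_1]\!]_{S,\mathcal{E}} \cap [\![\varphi_2]\!]_{S,\mathcal{E}}; \\
[\![\exists\bigcirc\varphi]\!]_{S,\mathcal{E}} &= \{s \in Q \mid (\exists t \in \delta(s) : t \in [\![\varphi]\!]_{S,\mathcal{E}})\}; \\
[\![\forall\bigcirc\varphi]\!]_{S,\mathcal{E}} &= \{s \in Q \mid (\forall t \in \delta(s) : t \in [\![\varphi]\!]_{S,\mathcal{E}})\}; \\
[\![\mu x : \varphi]\!]_{S,\mathcal{E}} &= \bigcap \{\rho \subseteq Q \mid \rho = [\![\varphi]\!]_{S,\mathcal{E}[x \mapsto \rho]}\}; \\
[\![\nu x : \varphi]\!]_{S,\mathcal{E}} &= \bigcup \{\rho \subseteq Q \mid \rho = [\![\varphi]\!]_{S,\mathcal{E}[x \mapsto \rho]}\}.
\end{aligned}$$

If we restrict ourselves to the closed formulas of the $\mu$-calculus, then we obtain a 
state logic, denoted $L_1^\mu$: the state $s \in Q$ satisfies the $L_1^\mu$-formula $\varphi$ if $s \in [\![\varphi]\!]_{S,\mathcal{E}}$ 
for any variable mapping $\mathcal{E}$; that is, $[\![\varphi]\!]_S = [\![\varphi]\!]_{S,\mathcal{E}}$ for any $\mathcal{E}$. □ 

Remark: Duality For every $L_1^\mu$-formula $\varphi$, the dual $L_1^\mu$-formula $\bar{\varphi}$ is obtained 
by replacing the constructors $p$, $\bar{p}$, $\vee$, $\wedge$, $\exists\bigcirc$, $\forall\bigcirc$, $\mu$, and $\nu$ by $\bar{p}$, $p$, $\wedge$, $\vee$, $\forall\bigcirc$, $\exists\bigcirc$, 
$\nu$, and $\mu$, respectively. Then, $[\![\bar{\varphi}]\!]_S = Q \setminus [\![\varphi]\!]_S$. It follows that the answer of the 
model-checking question for a state $s \in Q$ and an $L_1^\mu$-formula $\varphi$ is complementary 
to the answer of the model-checking question for $s$ and the dual formula $\bar{\varphi}$. □ 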

The following facts about the $\mu$-calculus are relevant in our context \snm- 
First, $L_1^\mu$ admits abstraction, and the state equivalence induced by $L_1^\mu$ is $\cong_1$ 
(bisimilarity). Second, $L_1^\mu$ is very expressive; in particular, $L_1^\mu$ is more expressive 
than the temporal logics Ctl* and Ctl, which also induce bisimilarity. Third, 
the definition of $L_1^\mu$ naturally suggests a model-checking method for finite-state 
systems, where each fixpoint can be computed by successive approximation. The 
symbolic semi-algorithm ModelCheck of Figure 2 applies this method to 
infinite-state systems. 

Suppose that the input given to ModelCheck is the region algebra of a symbolic 
transition system $S = (Q, \delta, R, \ulcorner\cdot\urcorner, P)$, a $\mu$-calculus formula $\varphi$, and any mapping 
$E : X \to 2^R$ from the variables to sets of regions. Then for each recursive 
call of ModelCheck, each $T_i$, for $i \ge 0$, is a finite set of regions from $R$, and 
each recursive call returns a finite set of regions from $R$. It is easy to check 
that all of these regions are also generated by the semi-algorithm Closure1 on 
input $\mathcal{R}_S$. Thus, if Closure1 terminates, then so does ModelCheck. Furthermore, 
if it terminates, then ModelCheck returns a set $[\varphi]_E \subseteq R$ of regions such that 
$\bigcup \{\ulcorner \sigma \urcorner \mid \sigma \in [\varphi]_E\} = [\![\varphi]\!]_{S,\mathcal{E}}$, where $\mathcal{E}(x) = \bigcup \{\ulcorner \sigma \urcorner \mid \sigma \in E(x)\}$ for all $x \in X$. In 
particular, if $\varphi$ is closed, then a state $s \in Q$ satisfies $\varphi$ iff $\mathit{Member}(s, \sigma)$ for some 
region $\sigma \in [\varphi]_E$. 

Theorem 1B For all symbolic transition systems $S$ in STS1 and every 
$L_1^\mu$-formula $\varphi$, the symbolic semi-algorithm ModelCheck terminates on the region 
algebra $\mathcal{R}_S$ and the input formula $\varphi$. 

Corollary 1B The $L_1^\mu$ model-checking problem is decidable for the class STS1 
of symbolic transition systems. 
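
The successive-approximation method for fixpoints can be sketched on an explicit finite system. The example below is a simplification under stated assumptions: it works directly on sets of states rather than on sets of regions, and it evaluates only two fixed formulas, $(\mu x : p \vee \exists\bigcirc x)$ and $(\nu x : q \wedge \exists\bigcirc x)$. All function and variable names are hypothetical.

```python
def pre(delta, target):
    """States with at least one successor in `target`."""
    return {s for s, successors in delta.items() if successors & target}

def least_fixpoint(f):
    """Approximate from below: start at the empty set, stop when nothing is added."""
    current = set()
    while True:
        nxt = f(current)
        if nxt <= current:
            return current
        current = nxt

def greatest_fixpoint(f, universe):
    """Approximate from above: start at all states, stop when nothing is removed."""
    current = set(universe)
    while True:
        nxt = f(current)
        if nxt >= current:
            return current
        current = nxt

# Chain 0 -> 1 -> 2 -> 3 with a self-loop on 3, plus an isolated loop on 4.
delta = {0: {1}, 1: {2}, 2: {3}, 3: {3}, 4: {4}}
p = {3}
q = {2, 3, 4}

# (mu x: p or EX x): states from which p is reachable.
can_reach_p = least_fixpoint(lambda x: p | pre(delta, x))
# (nu x: q and EX x): states with an infinite path that stays inside q.
can_stay_in_q = greatest_fixpoint(lambda x: q & pre(delta, x), delta.keys())
```

The two stopping tests mirror the termination tests used for the $\mu$ and $\nu$ cases, specialised to explicit sets.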

1.4 Example: Singular Hybrid Automata 

The fundamental theorem of timed automata shows that for every timed 

automaton, the (time-abstract) bisimilarity relation has finite index. The proof 
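Symbolic semi-algorithm ModelCheck 

Input: a region algebra $\mathcal{R} = (P, \mathit{Pre}, \mathit{And}, \mathit{Diff}, \mathit{Empty})$, a formula $\varphi \in L_1^\mu$, 
and a mapping $E$ with domain $X$. 

Output: $[\varphi]_E :=$ 

    if $\varphi = p$ then return $\{p\}$; 
    if $\varphi = \bar{p}$ then return $\{\mathit{Diff}(q, p) \mid q \in P\}$; 
    if $\varphi = (\varphi_1 \vee \varphi_2)$ then return $[\varphi_1]_E \cup [\varphi_2]_E$; 
    if $\varphi = (\varphi_1 \wedge \varphi_2)$ then 
        return $\{\mathit{And}(\sigma, \tau) \mid \sigma \in [\varphi_1]_E$ and $\tau \in [\varphi_2]_E\}$; 
    if $\varphi = \exists\bigcirc\varphi'$ then return $\{\mathit{Pre}(\sigma) \mid \sigma \in [\varphi']_E\}$; 
    if $\varphi = \forall\bigcirc\varphi'$ then return $P \mathbin{\backslash\backslash} \{\mathit{Pre}(\sigma) \mid \sigma \in (P \mathbin{\backslash\backslash} [\varphi']_E)\}$; 
    if $\varphi = (\mu x : \varphi')$ then 
        $T_0 := \emptyset$; 
        for $i = 0, 1, 2, \ldots$ do 
            $T_{i+1} := [\varphi']_{E[x \mapsto T_i]}$ 
        until $\bigcup \{\ulcorner \sigma \urcorner \mid \sigma \in T_{i+1}\} \subseteq \bigcup \{\ulcorner \sigma \urcorner \mid \sigma \in T_i\}$; 
        return $T_i$; 
    if $\varphi = (\nu x : \varphi')$ then 
        $T_0 := P$; 
        for $i = 0, 1, 2, \ldots$ do 
            $T_{i+1} := [\varphi']_{E[x \mapsto T_i]}$ 
        until $\bigcup \{\ulcorner \sigma \urcorner \mid \sigma \in T_{i+1}\} \supseteq \bigcup \{\ulcorner \sigma \urcorner \mid \sigma \in T_i\}$; 
        return $T_i$. 

The pairwise-difference operation $T \mathbin{\backslash\backslash} T'$ between two finite sets $T$ and $T'$ of regions 
is computed inductively as follows: 

    $T \mathbin{\backslash\backslash} \emptyset = T$; 
    $T \mathbin{\backslash\backslash} (\{\tau\} \cup T') = \{\mathit{Diff}(\sigma, \tau) \mid \sigma \in T\} \mathbin{\backslash\backslash} T'$. 

The termination test $\bigcup \{\ulcorner \sigma \urcorner \mid \sigma \in T\} \subseteq \bigcup \{\ulcorner \sigma \urcorner \mid \sigma \in T'\}$ is decided by checking 
that $\mathit{Empty}(\sigma)$ for each region $\sigma \in (T \mathbin{\backslash\backslash} T')$. 



Fig. 2. Model checking 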



can be extended to the singular automata |ACH~*~9,^ . It follows that the sym- 
bolic semi-algorithm ModelCheck, which has been implemented for polyhedral 
hybrid automata in the tool HyTech |HHWT9H) , decides all model-checking 
questions for singular automata. The singular automata form a maximal class 
of hybrid automata in STS1. This is because there is a 2D (two-dimensional) 
rectangular automaton whose bisimilarity relation is state equality EiiMl- 



Theorem 1C The singular automata belong to the class STS1. There is a 2D 
rectangular automaton that does not belong to STS1. 
2 Class-2 Symbolic Transition Systems 

Class-2 systems are characterized by finite similarity quotients. The region 
algebra of a class-2 system has a finite subalgebra that contains the observables and 
is closed under $\mathit{Pre}$ and $\mathit{And}$ operations. This enables the model checking of all 
existential and universal $\mu$-calculus properties. Infinite-state examples of class-2 
systems are provided by the 2D rectangular hybrid automata. 

2.1 Finite Characterization: Similarity 

Definition: Similarity Let $S$ be a transition system. Two states $s$ and $t$ of $S$ 
are similar, denoted $s \cong_2^S t$, if there is a simulation $\preceq$ on $S$ such that both $s \preceq t$ 
and $t \preceq s$. The state equivalence $\cong_2$ is called similarity. □ 

Definition: Class STS2 A symbolic transition system $S$ belongs to the class 
STS2 if the similarity relation $\cong_2^S$ has finite index. □ 

Since similarity is coarser than bisimilarity IM!], the class STS2 of symbolic 
transition systems is a proper extension of STS1. 
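
The difference between the two equivalences can be seen on a small finite example, sketched below. The example system and all names are assumptions made for illustration. Each state carries a label standing for the set of observables that contain it, and the largest simulation is computed by repeatedly deleting pairs that violate the second simulation condition. With `symmetric=True` the same loop also requires every move of the second state to be matched, which yields bisimilarity.

```python
def largest_simulation(states, delta, label, symmetric=False):
    """Greatest relation satisfying the simulation conditions.

    With symmetric=True, moves of the second state must be matched as well,
    so the result is the bisimilarity relation.
    """
    rel = {(s, t) for s in states for t in states if label[s] == label[t]}
    changed = True
    while changed:
        changed = False
        for (s, t) in list(rel):
            ok = all(any((s2, t2) in rel for t2 in delta[t]) for s2 in delta[s])
            if symmetric:
                ok = ok and all(any((s2, t2) in rel for s2 in delta[s])
                                for t2 in delta[t])
            if not ok:
                rel.discard((s, t))
                changed = True
    return rel

# s0 can move to s1 (which continues to s2) or to the dead end s3;
# t0 can only move to t1 (which continues to t2).
delta = {"s0": {"s1", "s3"}, "s1": {"s2"}, "s2": set(), "s3": set(),
         "t0": {"t1"}, "t1": {"t2"}, "t2": set()}
label = {"s0": "a", "t0": "a", "s1": "b", "s3": "b", "t1": "b",
         "s2": "c", "t2": "c"}

simulation = largest_simulation(delta.keys(), delta, label)
bisimulation = largest_simulation(delta.keys(), delta, label, symmetric=True)

similar = ("s0", "t0") in simulation and ("t0", "s0") in simulation
bisimilar = ("s0", "t0") in bisimulation
```

Here `s0` and `t0` simulate each other, because the dead end `s3` is simulated by `t1`, yet they are not bisimilar, because `t1` can still move while `s3` cannot.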

2.2 Symbolic State-Space Exploration: Intersection Refinement 

The symbolic semi-algorithm Closure2 of Figure 3 is an abstract version of the 
method presented in [HHK95] for computing the similarity relation of an 
infinite-state system. Suppose that the input given to Closure2 is the region algebra of 
a symbolic transition system $S = (Q, \delta, R, \ulcorner\cdot\urcorner, P)$. Given two states $s, t \in Q$, we 
say that $t$ simulates $s$ if $s \preceq t$ for some simulation $\preceq$ on $S$. For $i \ge 0$ and $s \in Q$, 
define 

$$\mathit{Sim}_i(s) = \bigcap \{\ulcorner \sigma \urcorner \mid \sigma \in T_i \text{ and } s \in \ulcorner \sigma \urcorner\},$$

where the set $T_i$ of regions is computed by Closure2. By induction it is easy to 
check that for all $i \ge 0$, if $t$ simulates $s$, then $t \in \mathit{Sim}_i(s)$. Thus, the extension of 
every region in $T_i$ is a $\cong_2^S$ block, and if $\cong_2^S$ has finite index, then Closure2 
terminates. Conversely, suppose that Closure2 terminates with $\ulcorner T_{i+1} \urcorner \subseteq \ulcorner T_i \urcorner$. From 
the definition of simulations it follows that if $t \in \mathit{Sim}_i(s)$, then $t$ simulates $s$. 
This implies that $\cong_2^S$ has finite index. 

Theorem 2A For all symbolic transition systems $S$, the symbolic semi-algorithm 
Closure2 terminates on the region algebra $\mathcal{R}_S$ iff $S$ belongs to the class STS2. 

Corollary 2A The $\cong_2$ (similarity) equivalence problem is decidable for the class 
STS2 of symbolic transition systems. 

2.3 Decidable Properties: Negation-Free Branching Time 

Definition: Negation-free $\mu$-calculus The negation-free $\mu$-calculus consists 
of the $\mu$-calculus formulas that are generated by the grammar 

$$\varphi ::= p \mid x \mid \varphi \vee \varphi \mid \varphi \wedge \varphi \mid \exists\bigcirc\varphi \mid (\mu x : \varphi) \mid (\nu x : \varphi),$$
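Symbolic semi-algorithm Closure2 

Input: a region algebra $\mathcal{R} = (P, \mathit{Pre}, \mathit{And}, \mathit{Diff}, \mathit{Empty})$. 

$T_0 := P$; 
for $i = 0, 1, 2, \ldots$ do 
    $T_{i+1} := T_i$ 
        $\cup\ \{\mathit{Pre}(\sigma) \mid \sigma \in T_i\}$ 
        $\cup\ \{\mathit{And}(\sigma, \tau) \mid \sigma, \tau \in T_i\}$ 
until $\ulcorner T_{i+1} \urcorner \subseteq \ulcorner T_i \urcorner$. 

The termination test $\ulcorner T_{i+1} \urcorner \subseteq \ulcorner T_i \urcorner$ is decided as in Figure 1. 



Fig. 3. Intersection refinement 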



for constants $p \in \Pi$ and variables $x \in X$. The state logic $L_2^\mu$ consists of the 
closed formulas of the negation-free $\mu$-calculus. The state logic $\overline{L_2^\mu}$ consists of 
the duals of all $L_2^\mu$-formulas. □ 

The following facts about the negation-free $\mu$-calculus and its dual are relevant 
in our context. First, both $L_2^\mu$ and $\overline{L_2^\mu}$ admit abstraction, and the state 
equivalence induced by both $L_2^\mu$ and $\overline{L_2^\mu}$ is $\cong_2$ (similarity). It follows that the 
logic $L_1^\mu$ with negation is more expressive than either $L_2^\mu$ or $\overline{L_2^\mu}$. Second, the 
negation-free logic $L_2^\mu$ is more expressive than the existential fragments of Ctl* 
and Ctl, which also induce similarity, and the dual logic $\overline{L_2^\mu}$ is more expressive 
than the universal fragments of Ctl* and Ctl, which again induce similarity. 

If we apply the symbolic semi-algorithm ModelCheck of Figure 2 to the region 
algebra of a symbolic transition system S and an input formula from $L_2^\mu$, then 
the cases $\varphi = \overline{p}$ and $\varphi = \forall\bigcirc\varphi'$ are never executed. It follows that all regions 
which are generated by ModelCheck are also generated by the semi-algorithm 
Closure2 on input $\mathcal{R}_S$. Thus, if Closure2 terminates, then so does ModelCheck. 

Theorem 2B For all symbolic transition systems S in STS2 and every 
$L_2^\mu$-formula $\varphi$, the symbolic semi-algorithm ModelCheck terminates on the region 
algebra $\mathcal{R}_S$ and the input formula $\varphi$. 

Corollary 2B The $L_2^\mu$ and $\overline{L_2^\mu}$ model-checking problems are decidable for the 
class STS2 of symbolic transition systems. 

2.4 Example: 2D Rectangular Hybrid Automata 

For every 2D rectangular automaton, the (time-abstract) similarity relation has 
finite index [HHK95]. It follows that the symbolic semi-algorithm ModelCheck, 
as implemented in HyTech, decides all $L_2^\mu$ and $\overline{L_2^\mu}$ model-checking questions for 
2D rectangular automata. The 2D rectangular automata form a maximal class of 
hybrid automata in STS2. This is because there is a 3D rectangular automaton 
whose similarity relation is state equality [HK96]. 

Theorem 2C The 2D rectangular automata belong to the class STS2. There is 
a 3D rectangular automaton that does not belong to STS2. 



3 Class-3 Symbolic Transition Systems 

Class-3 systems are characterized by finite trace-equivalence quotients. The re- 
gion algebra of a class-3 system has a finite subalgebra that contains the observ- 
ables and is closed under Pre operations and those And operations for which 
one of the two arguments is an observable. This enables the model checking 
of all linear temporal properties. Infinite-state examples of class-3 systems are 
provided by the rectangular hybrid automata. 

3.1 Finite Characterization: Traces 

Definition: Trace equivalence Let $S = (Q, \delta, \ulcorner\cdot\urcorner, P)$ be a transition system. 
Given a state $s_0 \in Q$, a source-$s_0$ trace $\pi$ of S is a finite sequence $p_0 p_1 \ldots p_n$ of 
observables $p_i \in P$ such that 

1. $s_0 \in \ulcorner p_0 \urcorner$; 

2. for all $0 \le i < n$, there is a state $s_{i+1} \in (\delta(s_i) \cap \ulcorner p_{i+1} \urcorner)$. 

The number n of observables (minus 1) is called the length of the trace $\pi$, the 
final state $s_n$ is the sink of $\pi$, and the final observable $p_n$ is the target of $\pi$. Two 
states $s, t \in Q$ are trace equivalent, denoted $s \cong_3^S t$, if every source-s trace of S 
is a source-t trace of S, and vice versa. The state equivalence $\cong_3$ is called trace 
equivalence. □ 

Definition: Class STS3 A symbolic transition system S belongs to the class 
STS3 if the trace-equivalence relation $\cong_3^S$ has finite index. □ 

Since trace equivalence is coarser than similarity, the class STS3 of symbolic 
transition systems is a proper extension of STS2. 

3.2 Symbolic State-Space Exploration: Observation Refinement 

Trace equivalence can be characterized operationally by the symbolic 
semi-algorithm Closure3 of Figure 4. We shall show that, when the input is the region 
algebra of a symbolic transition system $S = (Q, \delta, R, \ulcorner\cdot\urcorner, P)$, then Closure3 
terminates iff the trace-equivalence relation $\cong_3^S$ has finite index. Furthermore, upon 
termination, $s \cong_3^S t$ iff for each region $\sigma \in T_i$, we have $s \in \ulcorner\sigma\urcorner$ iff $t \in \ulcorner\sigma\urcorner$. 

Theorem 3A For all symbolic transition systems S, the symbolic semi-algorithm 
Closure3 terminates on the region algebra $\mathcal{R}_S$ iff S belongs to the class STS3. 

Proof We proceed in two steps. First, we show that Closure3 terminates 
on the region algebra $\mathcal{R}_S$ iff the equivalence relation $\cong^S_{L_3^\mu}$ induced by the 
linear-time $\mu$-calculus (defined below) has finite index. Second, we show that $\cong^S_{L_3^\mu}$ 
Symbolic semi-algorithm Closure3 

Input: a region algebra $\mathcal{R} = (P, Pre, And, Diff, Empty)$. 

$T_0 := P$; 
for $i = 0, 1, 2, \ldots$ do 
    $T_{i+1} := T_i$ 
        $\cup\ \{Pre(\sigma) \mid \sigma \in T_i\}$ 
        $\cup\ \{And(\sigma, p) \mid \sigma \in T_i \text{ and } p \in P\}$ 
until $\ulcorner T_{i+1} \urcorner \subseteq \ulcorner T_i \urcorner$. 

The termination test $\ulcorner T_{i+1} \urcorner \subseteq \ulcorner T_i \urcorner$ is decided as in Figure 1. 

Fig. 4. Observation refinement 



coincides with trace equivalence. The proof of the first part proceeds as usual. It 
can be seen by induction that for all $i \ge 0$, the extension of every region in $T_i$, as 
computed by Closure3, is a $\cong^S_{L_3^\mu}$ block, where $\cong^S_{L_3^\mu}$ is the state equivalence 
induced on S by $L_3^\mu$. Thus, if $\cong^S_{L_3^\mu}$ has finite index, then Closure3 
terminates. Conversely, suppose that Closure3 terminates with 
$\ulcorner T_{i+1} \urcorner \subseteq \ulcorner T_i \urcorner$. It can be shown that if two states are not 
$\cong^S_{L_3^\mu}$-equivalent, then there is a region in $T_i$ which contains one state but not 
the other. It follows that if for each region $\sigma \in T_i$, we have $s \in \ulcorner\sigma\urcorner$ iff 
$t \in \ulcorner\sigma\urcorner$, then $s \cong^S_{L_3^\mu} t$. This implies that $\cong^S_{L_3^\mu}$ has finite index. 

For the second part, we show that $L_3^\mu$ is as expressive as the logic $\exists$Büchi, whose 
formulas are the existentially interpreted Büchi automata, and that $\exists$Büchi is 
as expressive as $L_3^\mu$. This result is implicit in a proof by [EJS93]. By induction on 
the structure of an $L_3^\mu$-formula $\varphi$, we can construct a Büchi automaton $B_\varphi$ such 
that for all transition systems S, a state s of S satisfies $\varphi$ iff some infinite 
source-s trace of S is accepted by $B_\varphi$. Conversely, given a Büchi automaton B, 
we can construct an $L_3^\mu$-formula which is equivalent to $\exists B$ [Dam94]. Since the 
state equivalence induced by $\exists$Büchi is trace equivalence, it follows that $\cong^S_{L_3^\mu}$ is 
also trace equivalence. □ 

Corollary 3A The $\cong_3$ (trace) equivalence problem is decidable for the class 
STS3 of symbolic transition systems. 
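
The difference between Closure3 and Closure2 is that intersections are taken only with observables. The following sketch illustrates this on a finite system with explicit regions; the representation (frozensets of states), the function names and the four-state example are assumptions made for illustration, not part of the paper.

```python
from collections import defaultdict


def pre(region, delta):
    """Pre(region): states with at least one successor inside `region`."""
    return frozenset(s for s, succs in delta.items() if succs & region)


def closure3(observables, delta):
    """Observation refinement: close under Pre and And(sigma, p), p observable."""
    obs = set(observables)
    t = set(obs)                                   # T_0 := P
    while True:
        nxt = set(t)
        nxt |= {pre(r, delta) for r in t}          # add Pre(sigma)
        nxt |= {r & p for r in t for p in obs}     # add And(sigma, p)
        if nxt <= t:
            return t
        t = nxt


def classes(states, regions):
    """Group states that lie in exactly the same computed regions."""
    groups = defaultdict(set)
    for s in states:
        signature = frozenset(r for r in regions if s in r)
        groups[signature].add(s)
    return {frozenset(g) for g in groups.values()}


# Hypothetical example: 0 -> 1 -> 2, 3 -> 2, 2 -> 2;
# observable p holds in states 0, 1, 3 and observable q in state 2.
delta = {0: {1}, 1: {2}, 2: {2}, 3: {2}}
p, q = frozenset({0, 1, 3}), frozenset({2})
regions = closure3([p, q], delta)
print(classes(delta.keys(), regions))
```

In this example states 1 and 3 have the same traces and therefore end up in the same class, while states 0 and 2 are each alone.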

3.3 Decidable Properties: Linear Time 

Definition: Linear-time $\mu$-calculus The linear-time $\mu$-calculus (also called 
“L1” in [EJS93]) consists of the $\mu$-calculus formulas that are generated by the 
grammar 

$\varphi ::= p \mid x \mid \varphi \vee \varphi \mid p \wedge \varphi \mid \exists\bigcirc\varphi \mid (\mu x\colon \varphi) \mid (\nu x\colon \varphi)$, 

for constants $p \in \Pi$ and variables $x \in X$. The state logic $L_3^\mu$ consists of the 
closed formulas of the linear-time $\mu$-calculus. The state logic $\overline{L_3^\mu}$ consists of the 
duals of all $L_3^\mu$-formulas. □ 

The following facts about the linear-time $\mu$-calculus and its dual are relevant 
in our context (cf. the second part of the proof of Theorem 3A). First, both 
$L_3^\mu$ and $\overline{L_3^\mu}$ admit abstraction, and the state equivalence induced by both $L_3^\mu$ 
and $\overline{L_3^\mu}$ is $\cong_3$ (trace equivalence). It follows that the logic $L_2^\mu$ with unrestricted 
conjunction is more expressive than $L_3^\mu$, and $\overline{L_2^\mu}$ is more expressive than $\overline{L_3^\mu}$. 
Second, the logic $L_3^\mu$ with restricted conjunction is more expressive than the 
existential interpretation of the linear temporal logic Ltl, which also induces 
trace equivalence. For example, the existential Ltl formula $\exists(p\,\mathcal{U}\,q)$ (“on some 
trace, p until q”) is equivalent to the $L_3^\mu$-formula $(\mu x\colon q \vee (p \wedge \exists\bigcirc x))$ (notice 
that one argument of the conjunction is a constant). The dual logic $\overline{L_3^\mu}$ is more 
expressive than the usual, universal interpretation of Ltl, which again induces 
trace equivalence. For example, the (universal) Ltl formula $p\,\mathcal{W}\,q$ (“on all traces, 
either p forever, or p until q”) is equivalent to the $\overline{L_3^\mu}$-formula $(\nu x\colon p \wedge \forall\bigcirc(q \vee x))$ 
(notice that one argument of the disjunction is a constant). 
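
To make the first example concrete, the least fixpoint $(\mu x\colon q \vee (p \wedge \exists\bigcirc x))$ can be evaluated by the usual iteration from the empty set. The sketch below does this on a finite system with explicit state sets; the function names and the example system are hypothetical, and the explicit-set representation is an assumption made for illustration only.

```python
def pre(region, delta):
    """Pre(region): states with at least one successor inside `region`."""
    return frozenset(s for s, succs in delta.items() if succs & region)


def exists_until(p, q, delta):
    """Least fixpoint of x = q | (p & Pre(x)), iterated from the empty set."""
    x = frozenset()
    while True:
        nxt = q | (p & pre(x, delta))
        if nxt == x:
            return x
        x = nxt


# Hypothetical example: 0 -> 1 -> 2 -> 2, state 3 has no successor;
# p holds in states 0, 1, 3 and q holds in state 2.
delta = {0: {1}, 1: {2}, 2: {2}, 3: set()}
p, q = frozenset({0, 1, 3}), frozenset({2})
print(sorted(exists_until(p, q, delta)))
```

Each iteration intersects only with the constant p, which matches the restricted conjunction of the linear-time grammar.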

If we apply the symbolic semi-algorithm ModelCheck of Figure 2 to the region 
algebra of a symbolic transition system S and an input formula from $L_3^\mu$, then 
all regions which are generated by ModelCheck are also generated by the 
semi-algorithm Closure3 on input $\mathcal{R}_S$. Thus, if Closure3 terminates, then so does 
ModelCheck. 

Theorem 3B For all symbolic transition systems S in STS3 and every 
$L_3^\mu$-formula $\varphi$, the symbolic semi-algorithm ModelCheck terminates on the region 
algebra $\mathcal{R}_S$ and the input formula $\varphi$. 

Corollary 3B The $L_3^\mu$ and $\overline{L_3^\mu}$ model-checking problems are decidable for the 
class STS3 of symbolic transition systems. 

Remark: Ltl model checking These results suggest, in particular, a symbolic 
procedure for model checking Ltl properties over STS3 systems. Suppose 
that S is a symbolic transition system in the class STS3, and $\varphi$ is an Ltl formula. 
First, convert $\neg\varphi$ to a Büchi automaton $B_{\neg\varphi}$ using a tableau construction, and 
then to an equivalent $L_3^\mu$-formula $\psi$ (introduce one variable per state of $B_{\neg\varphi}$). 
Second, run the symbolic semi-algorithm ModelCheck on inputs $\mathcal{R}_S$ and $\psi$. It 
will terminate with a representation of the complement of the set of states that 
satisfy $\varphi$ in S. □ 

3.4 Example: Rectangular Hybrid Automata 

For every rectangular automaton, the (time-abstract) trace-equivalence relation 
has finite index [HKPV98]. It follows that the symbolic semi-algorithm 
ModelCheck, as implemented in HyTech, decides all $L_3^\mu$ and $\overline{L_3^\mu}$ model-checking 
questions for rectangular automata. The rectangular automata form a maximal 
class of hybrid automata in STS3. This is because for simple generalizations of 
rectangular automata, the reachability problem is undecidable [HKPV98]. 



Theorem 3C The rectangular automata belong to the class STS3. 

Fig. 5. Distance equivalence is coarser than trace equivalence 



4 Class-4 Symbolic Transition Systems 

We define two states of a transition system to be “distance equivalent” if for every 
distance d, the same observables can be reached in d transitions. Class-4 systems 
are characterized by finite distance-equivalence quotients. The region algebra of 
a class-4 system has a finite subalgebra that contains the observables and is 
closed under Pre operations. This enables the model checking of all existential 
conjunction-free and universal disjunction-free $\mu$-calculus properties, such as the 
property that an observable can be reached in an even number of transitions. 



4.1 Finite Characterization: Equi-distant Targets 

Definition: Distance equivalence Let S be a transition system. Two states 
s and t of S are distance equivalent, denoted $s \cong_4^S t$, if for every source-s trace 
of S with length n and target p, there is a source-t trace of S with length n and 
target p, and vice versa. The state equivalence $\cong_4$ is called distance equivalence. □ 

Definition: Class STS4 A symbolic transition system S belongs to the class 
STS4 if the distance-equivalence relation $\cong_4^S$ has finite index. □ 

Figure 5 shows that distance equivalence is coarser than trace equivalence (s and 
t are distance equivalent but not trace equivalent). It follows that the class STS4 
of symbolic transition systems is a proper extension of STS3. 



4.2 Symbolic State-Space Exploration: Predecessor Iteration 

The symbolic semi-algorithm Closure4 of Figure 6 computes the subalgebra of 
a region algebra $\mathcal{R}_S$ that contains the observables and is closed under the Pre 
operation. Suppose that the input given to Closure4 is the region algebra of a 
symbolic transition system $S = (Q, \delta, R, \ulcorner\cdot\urcorner, P)$. For $i \ge 0$ and $s, t \in Q$, define 
$s \sim_i^S t$ if for every source-s trace of S with length $n \le i$ and target p, there is a 
source-t trace of S with length n and target p, and vice versa. By induction it is 
easy to check that for all $i \ge 0$, the extension of every region in $T_i$, as computed 
by Closure4, is a $\sim_i^S$ block. Since $\sim_i^S$ is as coarse as $\sim_{i+1}^S$ for all $i \ge 0$, and $\cong_4^S$ is 
Symbolic semi-algorithm Closure4 

Input: a region algebra $\mathcal{R} = (P, Pre, And, Diff, Empty)$. 

$T_0 := P$; 
for $i = 0, 1, 2, \ldots$ do 
    $T_{i+1} := T_i$ 
        $\cup\ \{Pre(\sigma) \mid \sigma \in T_i\}$ 
until $\ulcorner T_{i+1} \urcorner \subseteq \ulcorner T_i \urcorner$. 

The termination test $\ulcorner T_{i+1} \urcorner \subseteq \ulcorner T_i \urcorner$ is decided as in Figure 1. 

Fig. 6. Predecessor iteration 

equal to $\bigcap \{ \sim_i^S \mid i \ge 0 \}$, if $\cong_4^S$ has finite index, then $\cong_4^S$ is equal to $\sim_i^S$ for some 
$i \ge 0$. Then, Closure4 will terminate in i iterations. Conversely, suppose that 
Closure4 terminates with $\ulcorner T_{i+1} \urcorner \subseteq \ulcorner T_i \urcorner$. In this case, if for all regions $\sigma \in T_i$, we 
have $s \in \ulcorner\sigma\urcorner$ iff $t \in \ulcorner\sigma\urcorner$, then $s \cong_4^S t$. This is because if s can reach an observable 
p in n transitions, but t cannot, then there is a region in $T_i$, namely, $Pre^n(p)$, 
such that $s \in \ulcorner Pre^n(p) \urcorner$ and $t \notin \ulcorner Pre^n(p) \urcorner$. It follows that $\cong_4^S$ has finite index. 

Theorem 4A For all symbolic transition systems S, the symbolic semi-algorithm 
Closure4 terminates on the region algebra $\mathcal{R}_S$ iff S belongs to the class STS4. 

Corollary 4A The $\cong_4$ (distance) equivalence problem is decidable for the class 
STS4 of symbolic transition systems. 
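
The following sketch runs predecessor iteration on a small finite system in which two states are distance equivalent but not trace equivalent. The ten-state system is a made-up example (it is not the system of Figure 5), and the explicit-set representation and the function names are assumptions for illustration.

```python
def pre(region, delta):
    """Pre(region): states with at least one successor inside `region`."""
    return frozenset(s for s, succs in delta.items() if succs & region)


def closure4(observables, delta):
    """Predecessor iteration: close the observables under Pre only."""
    t = set(observables)                           # T_0 := P
    while True:
        nxt = t | {pre(r, delta) for r in t}       # add Pre(sigma)
        if nxt <= t:
            return t
        t = nxt


def separated(s, t, regions):
    """True if some computed region contains exactly one of s and t."""
    return any((s in r) != (t in r) for r in regions)


# Hypothetical example. State 0 has the traces p q r and p q' r';
# state 5 has the traces p q r' and p q' r.
edges = [(0, 1), (0, 2), (1, 3), (2, 4), (5, 6), (5, 7), (6, 8), (7, 9)]
delta = {s: set() for s in range(10)}
for a, b in edges:
    delta[a].add(b)
observables = [
    frozenset({0, 5}),   # p
    frozenset({1, 6}),   # q
    frozenset({2, 7}),   # q'
    frozenset({3, 9}),   # r
    frozenset({4, 8}),   # r'
]
regions = closure4(observables, delta)
print(separated(0, 5, regions))   # are states 0 and 5 told apart?
print(separated(1, 6, regions))   # are states 1 and 6 told apart?
```

States 0 and 5 reach the same observables at every distance, so no region built from Pre alone distinguishes them, although their trace sets differ.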

4.3 Decidable Properties: Conjunction-Free Linear Time 

Definition: Conjunction-free $\mu$-calculus The conjunction-free $\mu$-calculus 
consists of the $\mu$-calculus formulas that are generated by the grammar 

$\varphi ::= p \mid x \mid \varphi \vee \varphi \mid \exists\bigcirc\varphi \mid (\mu x\colon \varphi)$ 

for constants $p \in \Pi$ and variables $x \in X$. The state logic $L_4^\mu$ consists of the 
closed formulas of the conjunction-free $\mu$-calculus. The state logic $\overline{L_4^\mu}$ consists of 
the duals of all $L_4^\mu$-formulas. □ 

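Definition: Conjunction-free temporal logic The formulas of the 
conjunction-free temporal logic $L_4^\Diamond$ are generated by the grammar 

$\varphi ::= p \mid \varphi \vee \varphi \mid \exists\bigcirc\varphi \mid \exists\Diamond_{\le d}\,\varphi \mid \exists\Diamond\varphi$, 

for constants $p \in \Pi$ and nonnegative integers d. Let $S = (Q, \delta, \ulcorner\cdot\urcorner, P)$ be a 
transition system whose observables include all constants; that is, $\Pi \subseteq P$. The 
$L_4^\Diamond$-formula $\varphi$ defines the set $[\![\varphi]\!]_S \subseteq Q$ of satisfying states: 

$[\![p]\!]_S = \ulcorner p \urcorner$; 

$[\![\varphi_1 \vee \varphi_2]\!]_S = [\![\varphi_1]\!]_S \cup [\![\varphi_2]\!]_S$; 

$[\![\exists\bigcirc\varphi]\!]_S = \{ s \in Q \mid (\exists t \in \delta(s)\colon t \in [\![\varphi]\!]_S) \}$; 

$[\![\exists\Diamond_{\le d}\,\varphi]\!]_S = \{ s \in Q \mid$ there is a source-s trace of S with 
length at most d and sink in $[\![\varphi]\!]_S \}$; 

$[\![\exists\Diamond\varphi]\!]_S = \{ s \in Q \mid$ there is a source-s trace of S with sink in $[\![\varphi]\!]_S \}$. 

(The constructor $\exists\Diamond_{\le d}$ is definable from $\exists\bigcirc$ and $\vee$; however, it will be essential 
in the $\exists\bigcirc$-free fragment of $L_4^\Diamond$ we will consider below.) □ 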

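Remark: Duality For every $L_4^\Diamond$-formula $\varphi$, the dual formula $\overline{\varphi}$ is obtained by 
replacing the constructors $p$, $\vee$, $\exists\bigcirc$, $\exists\Diamond_{\le d}$, and $\exists\Diamond$ by $\overline{p}$, $\wedge$, $\forall\bigcirc$, $\forall\Box_{\le d}$, and $\forall\Box$, 
respectively. The semantics of the dual constructors is defined as usual, such that 
$[\![\overline{\varphi}]\!]_S = Q \setminus [\![\varphi]\!]_S$. The state logic $\overline{L_4^\Diamond}$ consists of the duals of all $L_4^\Diamond$-formulas. It 
follows that the answer of the model-checking question for a state $s \in Q$ and an 
$\overline{L_4^\Diamond}$-formula $\overline{\varphi}$ is complementary to the answer of the model-checking question 
for s and the $L_4^\Diamond$-formula $\varphi$. □ 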

The following facts about the conjunction-free $\mu$-calculus, conjunction-free 
temporal logic, and their duals are relevant in our context. First, both $L_4^\mu$ and 
$\overline{L_4^\mu}$ admit abstraction, and the state equivalence induced by both $L_4^\mu$ and $\overline{L_4^\mu}$ 
is $\cong_4$ (distance equivalence). It follows that the logic $L_3^\mu$ with restricted 
conjunction is more expressive than $L_4^\mu$, and $\overline{L_3^\mu}$ is more expressive than $\overline{L_4^\mu}$. Second, 
the conjunction-free $\mu$-calculus $L_4^\mu$ is more expressive than the conjunction-free 
temporal logic $L_4^\Diamond$, and $\overline{L_4^\mu}$ is more expressive than $\overline{L_4^\Diamond}$, both of which also 
induce distance equivalence. For example, the property that an observable can be 
reached in an even number of transitions can be expressed in $L_4^\mu$ but not in $L_4^\Diamond$. 
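
One way to write the even-distance property in the conjunction-free grammar is $(\mu x\colon p \vee \exists\bigcirc\exists\bigcirc x)$; this particular formula is our choice for illustration and is not quoted from the paper. The sketch below evaluates it by fixpoint iteration on a finite system with explicit state sets (again an assumption, as are the function names and the example).

```python
def pre(region, delta):
    """Pre(region): states with at least one successor inside `region`."""
    return frozenset(s for s, succs in delta.items() if succs & region)


def even_reach(p, delta):
    """Least fixpoint of x = p | Pre(Pre(x)): p reachable in an even number of steps."""
    x = frozenset()
    while True:
        nxt = p | pre(pre(x, delta), delta)
        if nxt == x:
            return x
        x = nxt


# Hypothetical example: the chain 0 -> 1 -> 2 -> 3 with observable p = {3}.
delta = {0: {1}, 1: {2}, 2: {3}, 3: set()}
print(sorted(even_reach(frozenset({3}), delta)))
```

Only Pre and union are used, so every set built during the iteration is a union of regions that Closure4 would also generate.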

If we apply the symbolic semi-algorithm ModelCheck of Figure 2 to the region 
algebra of a symbolic transition system S and an input formula from $L_4^\mu$, then 
all regions which are generated by ModelCheck are also generated by the 
semi-algorithm Closure4 on input $\mathcal{R}_S$. Thus, if Closure4 terminates, then so does 
ModelCheck. 

Theorem 4B For all symbolic transition systems S in STS4 and every 
$L_4^\mu$-formula $\varphi$, the symbolic semi-algorithm ModelCheck terminates on the region 
algebra $\mathcal{R}_S$ and the input formula $\varphi$. 

Corollary 4B The $L_4^\mu$ and $\overline{L_4^\mu}$ model-checking problems are decidable for the 
class STS4 of symbolic transition systems. 

5 Class-5 Symbolic Transition Systems 

We define two states of a transition system to be “bounded-reach equivalent” 
if for every distance d, the same observables can be reached in d or fewer tran- 
sitions. Class-5 systems are characterized by finite bounded-reach-equivalence 
quotients. Equivalently, for every observable p there is a finite bound $n_p$ such 
that all states that can reach p can do so in at most $n_p$ transitions. This enables 
the model checking of all reachability and (by duality) invariance properties. The 
Fig. 7. Bounded-reach equivalence is coarser than distance equivalence 



transition systems in class 5 have also been called “well-structured” [ACJT96]. 
Infinite-state examples of class-5 systems are provided by networks of rectangular 
hybrid automata. 

5.1 Finite Characterization: Bounded-Distance Targets 

Definition: Bounded-reach equivalence Let S be a transition system. Two 
states s and t of S are bounded-reach equivalent, denoted $s \cong_5^S t$, if for every 
source-s trace of S with length n and target p, there is a source-t trace of S with 
length at most n and target p, and vice versa. The state equivalence $\cong_5$ is called 
bounded-reach equivalence. □ 

Definition: Class STS5 A symbolic transition system S belongs to the class 
STS5 if the bounded-reach-equivalence relation $\cong_5^S$ has finite index. □ 

Figure 7 shows that bounded-reach equivalence is coarser than distance 
equivalence (all states $s_i$, for $i \ge 0$, are bounded-reach equivalent, but no two of them 
are distance equivalent). It follows that the class STS5 of symbolic transition 
systems is a proper extension of STS4. 

5.2 Symbolic State-Space Exploration: Predecessor Aggregation 

The symbolic semi-algorithm Reach of Figure 8 starts from the observables and 
repeatedly applies the Pre operation, but its termination criterion is more 
easily met than the termination criterion of the semi-algorithm Closure4; that is, 
Reach may terminate on more inputs than Closure4. Indeed, we shall show 
that, when the input is the region algebra of a symbolic transition system 
$S = (Q, \delta, R, \ulcorner\cdot\urcorner, P)$, then Reach terminates iff S belongs to the class STS5. 
Furthermore, upon termination, $s \cong_5^S t$ iff for each observation $p \in P$ and each 
region $\sigma \in T_i^p$, we have $s \in \ulcorner\sigma\urcorner$ iff $t \in \ulcorner\sigma\urcorner$. 



An alternative characterization of the class STS5 can be given using 
well-quasi-orders on states [ACJT96, FS98]. A quasi-order on a set A is a reflexive and 
transitive binary relation on A. A well-quasi-order on A is a quasi-order $\preceq$ on A 






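Symbolic semi-algorithm Reach 

Input: a region algebra $\mathcal{R} = (P, Pre, And, Diff, Empty)$. 

for each $p \in P$ do 
    $T_0^p := \{p\}$; 
    for $i = 0, 1, 2, \ldots$ do 
        $T_{i+1}^p := T_i^p \cup \{Pre(\sigma) \mid \sigma \in T_i^p\}$ 
    until $\bigcup\{\ulcorner\sigma\urcorner \mid \sigma \in T_{i+1}^p\} \subseteq \bigcup\{\ulcorner\sigma\urcorner \mid \sigma \in T_i^p\}$ 
end. 

The termination test $\bigcup\{\ulcorner\sigma\urcorner \mid \sigma \in T_{i+1}^p\} \subseteq \bigcup\{\ulcorner\sigma\urcorner \mid \sigma \in T_i^p\}$ is decided as in 
Figure 1. 

Fig. 8. Predecessor aggregation 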



such that for every infinite sequence $a_0, a_1, a_2, \ldots$ of elements $a_i \in A$ there exist 
indices i and j with $i < j$ and $a_i \preceq a_j$. A set $B \subseteq A$ is upward-closed if for all 
$b \in B$ and $a \in A$, if $b \preceq a$, then $a \in B$. It can be shown that if $\preceq$ is a 
well-quasi-order on A, then every infinite increasing sequence $B_0 \subseteq B_1 \subseteq B_2 \subseteq \cdots$ 
of upward-closed sets $B_i \subseteq A$ eventually stabilizes; that is, there exists an index 
$i \ge 0$ such that $B_j = B_i$ for all $j \ge i$. 

Theorem 5A. For all symbolic transition systems S, the following three 
conditions are equivalent: 

1. S belongs to the class STS5. 

2. The symbolic semi-algorithm Reach terminates on the region 
algebra $\mathcal{R}_S$. 

3. There is a well-quasi-order $\preceq$ on the states of S such that for all 
observations p and all nonnegative integers d, the set $[\![\exists\Diamond_{\le d}\,p]\!]_S$ is 
upward-closed. 

Proof (2 ⇒ 1) Define $s \sim_{\le n}^S t$ if for all observations p, for every source-s trace 
with length n and target p, there is a source-t trace with length at most n 
and target p, and vice versa. Note that $\sim_{\le n}^S$ has finite index for all $n \ge 0$. 
Suppose that the semi-algorithm Reach terminates in at most i iterations for 
each observation p. Then for all $n \ge i$, the equivalence relation $\sim_{\le n}^S$ is equal to 
$\sim_{\le i}^S$. Since $\cong_5^S$ is equal to $\bigcap \{ \sim_{\le n}^S \mid n \ge 0 \}$, it has finite index. 

(1 ⇒ 3) Define the quasi-order $s \preceq_5^S t$ if for all observables p and all $n \ge 0$, for 
every source-s trace with length n and target p, there is a source-t trace with 
length at most n and target p. Then each set $[\![\exists\Diamond_{\le d}\,p]\!]_S$, for an observable p and 
a nonnegative integer d, is upward-closed with respect to $\preceq_5^S$. Furthermore, if 
$\cong_5^S$ has finite index, then $\preceq_5^S$ is a well-quasi-order. This is because $s \cong_5^S t$ implies 
$s \preceq_5^S t$: if there were an infinite sequence $s_0, s_1, s_2, \ldots$ of states such that for all 
$i \ge 0$ and $j < i$, we have $s_j \not\preceq_5^S s_i$, then no two of these states would be $\cong_5^S$ 
equivalent. 

(3 ⇒ 2) This part of the proof follows immediately from the stabilization 
property of well-quasi-orders [ACJT96]. □ 
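
The weaker termination test of Reach, which compares only the unions of the computed regions, can be sketched as follows for a single observable. The explicit-set representation, the function names and the four-state example are assumptions made for illustration; on a finite state space the loop always stops.

```python
def pre(region, delta):
    """Pre(region): states with at least one successor inside `region`."""
    return frozenset(s for s, succs in delta.items() if succs & region)


def reach(p, delta):
    """Predecessor aggregation for one observable p; returns the covered states."""
    t = {p}                                        # T_0 := {p}
    while True:
        nxt = t | {pre(r, delta) for r in t}       # add Pre(sigma)
        covered_next = frozenset().union(*nxt)
        covered = frozenset().union(*t)
        if covered_next <= covered:                # union did not grow: stop
            return covered
        t = nxt


# Hypothetical example: the cycle 0 -> 1 -> 2 -> 0 and a separate self-loop on 3.
delta = {0: {1}, 1: {2}, 2: {0}, 3: {3}}
print(sorted(reach(frozenset({0}), delta)))
```

The result is the set of states from which the observable can be reached; individual regions such as Pre(p) and Pre(Pre(p)) may keep changing even after their union has stabilized.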

5.3 Decidable Properties: Bounded Reachability 

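Definition: Bounded-reachability logic The bounded-reachability logic $L_5^\Diamond$ 
consists of the $L_4^\Diamond$-formulas that are generated by the grammar 

$\varphi ::= p \mid \varphi \vee \varphi \mid \exists\Diamond_{\le d}\,\varphi \mid \exists\Diamond\varphi$, 

for constants $p \in \Pi$ and nonnegative integers d. The state logic $\overline{L_5^\Diamond}$ consists of 
the duals of all $L_5^\Diamond$-formulas. □ 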

The following facts about bounded-reachability logic and its dual are relevant in 
our context. Both $L_5^\Diamond$ and $\overline{L_5^\Diamond}$ admit abstraction, and the state equivalence 
induced by both $L_5^\Diamond$ and $\overline{L_5^\Diamond}$ is $\cong_5$ (bounded-reach equivalence). It follows that the 
conjunction-free temporal logic $L_4^\Diamond$ is more expressive than $L_5^\Diamond$, and $\overline{L_4^\Diamond}$ is more 
expressive than $\overline{L_5^\Diamond}$. For example, the property that an observable can be reached 
in exactly d transitions can be expressed in $L_4^\Diamond$ but not in $L_5^\Diamond$. Since $L_5^\Diamond$ admits 
abstraction, and for STS5 systems the induced quotient can be constructed using 
the symbolic semi-algorithm Reach, we have the following theorem. 

Theorem 5B The $L_5^\Diamond$ and $\overline{L_5^\Diamond}$ model-checking problems are decidable for the 
class STS5 of symbolic transition systems. 

A direct symbolic model-checking semi-algorithm for $L_5^\Diamond$ and, indeed, $L_4^\Diamond$ is easily 
derived from the semi-algorithm Reach. Then, if Reach terminates, so does model 
checking for all $L_4^\Diamond$-formulas, including unbounded $\exists\Diamond$ properties. The extension 
to $L_4^\Diamond$ is possible, because $\exists\bigcirc$ properties pose no threat to termination. 

5.4 Example: Networks of Rectangular Hybrid Automata 

A network of timed automata consists of a finite state controller and an 
arbitrarily large set of identical 1D timed automata. The continuous evolution 
of the system increases the values of all variables. The discrete transitions of the 
system are specified by a set of synchronization rules. We generalize the definition 
to rectangular automata. Formally, a network of rectangular automata is a triple 
$(C, H, R)$, where C is a finite set of controller locations, H is a 1D rectangular 
automaton, and R is a finite set of rules of the form $r = ((c, c'), e_1, \ldots, e_n)$, 
where $c, c' \in C$ and $e_1, \ldots, e_n$ are jumps of H. The rule r is enabled if the 
controller state is c and there are n rectangular automata $H_1, \ldots, H_n$ whose 
states are such that the jumps $e_1, \ldots, e_n$, respectively, can be performed. The 
rule r is executed by simultaneously changing the controller state to c' and the 
state of each $H_i$, for $1 \le i \le n$, according to the jump $e_i$. The following result is 
proved in [AJ98] for networks of timed automata. The proof can be extended to 
rectangular automata using the observation that every rectangular automaton 
is simulated by an appropriate timed automaton [HKPV98]. 

Theorem 5C The networks of rectangular automata belong to the class STS5. 
There is a network of timed automata that does not belong to STS4. 

Fig. 9. Reach equivalence is coarser than bounded-reach equivalence 

6 General Symbolic Transition Systems 

For studying reachability questions on symbolic transition systems, it is natural 
to consider the following fragment of bounded-reachability logic. 

Definition: Reachability logic The reachability logic $L_6^\Diamond$ consists of the 
$L_5^\Diamond$-formulas that are generated by the grammar 

$\varphi ::= p \mid \varphi \vee \varphi \mid \exists\Diamond\varphi$, 

for constants $p \in \Pi$. □ 

The reachability logic $L_6^\Diamond$ is less expressive than the bounded-reachability 
logic $L_5^\Diamond$, because it induces the following state equivalence, $\cong_6$, which is coarser 
than bounded-reach equivalence (see Figure 9: all states $s_i$, for $i \ge 0$, are reach 
equivalent, but no two of them are bounded-reach equivalent). 

Definition: Reach equivalence Let S be a transition system. Two states s 
and t of S are reach equivalent, denoted $s \cong_6^S t$, if for every source-s trace of S 
with target p, there is a source-t trace of S with target p, and vice versa. The 
state equivalence $\cong_6$ is called reach equivalence. □ 

For every symbolic transition system S with k observables, the reach-equivalence 
relation $\cong_6^S$ has at most $2^k$ equivalence classes and, therefore, finite index. Since 
the reachability problem is undecidable for many kinds of symbolic transition 
systems (including Turing machines and polyhedral hybrid automata [ACH+95]), 
it follows that there cannot be a general algorithm for computing the 
reach-equivalence quotient of symbolic transition systems. 
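
The bound $2^k$ is easy to observe on a finite system given explicitly, where reachability is of course decidable; the undecidability statement above concerns infinite-state symbolic systems. The sketch below groups the states of a small finite graph by the set of observables they can reach. The representation, the function names and the example are hypothetical.

```python
from collections import defaultdict


def reachable_observables(s, delta, observables):
    """Names of the observables that are targets of some source-s trace."""
    seen, stack = {s}, [s]
    while stack:                                   # depth-first search from s
        u = stack.pop()
        for v in delta[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return frozenset(name for name, ext in observables.items() if ext & seen)


def reach_classes(delta, observables):
    """Partition the states by their set of reachable observables."""
    groups = defaultdict(set)
    for s in delta:
        groups[reachable_observables(s, delta, observables)].add(s)
    return {frozenset(g) for g in groups.values()}


# Hypothetical example with k = 2 observables: 0 -> 1 -> 2, state 3 is isolated.
delta = {0: {1}, 1: {2}, 2: set(), 3: set()}
observables = {"p": frozenset({0, 1, 3}), "q": frozenset({2})}
result = reach_classes(delta, observables)
print(result)
```

Each class is identified by a subset of the k observables, which is where the bound of $2^k$ classes comes from.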

Abstract. This survey is devoted to some aspects of the “P = NP ?” 
problem over the real numbers and more general algebraic structures. 
We argue that given a structure M, it is important to find out whether 
$\mathrm{NP}_M$ problems can be solved by polynomial depth computation trees, 
and if so whether these trees can be efficiently simulated by circuits. 
Point location, a problem of computational geometry, comes into play in 
the study of these questions for several structures of interest. 



1 Introduction 



In algebraic complexity one measures the complexity of an algorithm by the num- 
ber of basic operations performed during a computation. The basic operations 
are usually arithmetic operations and comparisons, but sometimes transcenden- 
tal functions are also allowed 1 12212.3126] . Even when the set of basic operations 
has been fixed, the complexity of a problem depends on the particular model of 
computation considered. The two main categories of interest for this paper are 
circuits and trees. In section Eland 0 we present a general framework for studying 
these questions, in the spirit of Poizat’s theory of computation over arbitrary 
structures. The focus is therefore on superpolynomial lower bounds for 
decision problems, and in particular for $\mathrm{NP}_M$-complete problems. This line of 
research was initiated by Blum, Shub and Smale 0; the main emphasis of their 
paper was on the case where M is a ring (see 0 for a recent account) . 

We will ignore completely the large body of work on (sub)polynomial lower 
bounds, which has been an active area of research for decades ([7] is a 
comprehensive text on this topic). A consequence of our higher ambition is that we 
actually have very few lower bounds to present. The main result of this type, 
presented in Theorem El was obtained by Meer m for the reals with addition 
and equality. The transfer theorems of section 0 show that there may be good 
reasons for this relative scarcity of definitive results. The bright side of this 
state of affairs is that there are plenty of difficult open problems to capture the 
attention of present and future researchers. 

As explained in section 0 point location, a problem of computational geome- 
try, plays an important role in the proofs of two transfer theorems. The branching 
complexity of point location in arrangements of real or complex hypersurfaces is 
also discussed in that section. The upper bound of TheoremElon point location 
in arrangements of complex hypersurfaces seems to be new; in the real case, this 
is a recent result of Grigoriev m- Finally, we show in section Elthat a solution to 
the “computation tree alternative” would lead to a much better understanding 
of the “P = NP ?” question over the real and complex numbers. 

2 Computation Models 

2.1 Arbitrary Structures 

We first recall some elementary definitions from logic, which should be familiar 
to most of our readers. By “structure”, we mean a set M equipped with a finite 
set of functions $f_i : M^{n_i} \to M$ and relations $r_i \subseteq M^{m_i}$. A function of arity 0 
is called a constant. We always assume that our structure contains the equality 
relation. 

Terms are built from the basic functions of M by composition. More precisely, 
we have the following inductive definition: 

(i) Variables and elements of M are terms of depth 0; 

(ii) A term of depth $d \ge 1$ is of the form $f_i(t_1, \ldots, t_{n_i})$ where $f_i$ is a function 
of M and $t_1, \ldots, t_{n_i}$ are terms of maximal depth $d - 1$. 

A term in which n distinct variables $x_1, \ldots, x_n$ occur computes a function from 
$M^n$ to M. 
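
The inductive definition of terms translates directly into a small recursive data structure. In the sketch below, which is only an illustration, a variable is a string, a parameter is any other non-tuple value, and a compound term is a tuple whose first component names a basic function; this encoding and the names `depth`, `evaluate` and `functions` are assumptions, not notation from the survey.

```python
def depth(term):
    """Depth of a term: 0 for variables and parameters, 1 + max over arguments otherwise."""
    if not isinstance(term, tuple):
        return 0
    _, *args = term
    return 1 + max((depth(a) for a in args), default=0)


def evaluate(term, functions, env):
    """Value of a term in the structure, given values for its variables."""
    if isinstance(term, str):          # variable
        return env[term]
    if not isinstance(term, tuple):    # parameter, i.e. an element of M
        return term
    name, *args = term
    return functions[name](*(evaluate(a, functions, env) for a in args))


# Hypothetical structure with addition and multiplication; the term (x + y) * x.
functions = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
term = ("mul", ("add", "x", "y"), "x")
print(depth(term), evaluate(term, functions, {"x": 2, "y": 3}))
```

With two distinct variables, this term computes a function from $M^2$ to M, in line with the remark above.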

An atomic formula is of the form $r_i(t_1, \ldots, t_{m_i})$ where $t_1, \ldots, t_{m_i}$ are terms. 
A quantifier-free formula is a boolean combination of atomic formulas. To be 
completely precise, one can give an inductive definition: 

(i) Atomic formulas are formulas of depth 0. 

(ii) If F is a formula of depth $d - 1$, $\neg F$ is a formula of depth d. 

(iii) If $\max(\mathrm{depth}(F), \mathrm{depth}(G)) = d - 1$, $F \vee G$ and $F \wedge G$ are formulas of depth d. 

A formula in which n distinct variables $x_1, \ldots, x_n$ occur defines a subset of $M^n$. 
The elements of M occurring in terms or formulas (or in trees, circuits, etc...) 
are called parameters. 

Example 1. If M is a field, terms of $(M, +, \times, =)$ represent polynomials; the sets 
defined by quantifier-free formulas are called constructible. If M is an ordered 
field, the definable sets of $(M, +, \times, <)$ are by definition the semi-algebraic sets. 

An existential formula is of the form $\exists x_1 \exists x_2 \cdots \exists x_n\, F$, where F is a 
quantifier-free formula. M is said to admit quantifier elimination if every existential formula 
is equivalent to a quantifier-free formula (of course, as long as complexity issues 
are not taken into account, it is sufficient to consider the case n = 1). As ex- 
plained in section 0 this notion plays an important role in the study of the 
“$\mathrm{P}_M = \mathrm{NP}_M$ ?” problem. 

2.2 Trees 

The simplest and most powerful computation model that we shall consider is the 
branching tree. A branching tree with n input variables $x_1, \ldots, x_n$ recognizes a 
subset of $M^n$ in the following way. Each internal node g is labeled by some atomic 
formula $F_g(x_1, \ldots, x_n)$ of M, and has two children. If the input satisfies $F_g$ we 
go left, otherwise we go right. Leaves are labeled accept or reject. Alternatively, 
leaves could be labeled by terms of M. The tree would then compute a function 
from $M^n$ to M. As with other tree models, complexity is measured by depth, i.e., 
the branching complexity of a subset of $M^n$ is the smallest depth of a branching 
tree that recognizes it. 
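
As an illustration of the model just described, the following sketch represents a branching tree whose internal nodes carry an atomic formula as a Python predicate. The class names, the encoding of atomic formulas as callables and the example set are assumptions made for illustration; in particular, the sketch charges nothing for evaluating a test, exactly as the branching tree model does.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union


@dataclass
class Leaf:
    accept: bool


@dataclass
class Node:
    test: Callable[[Sequence[float]], bool]   # atomic formula F_g(x_1, ..., x_n)
    left: "Tree"                              # followed when the input satisfies F_g
    right: "Tree"                             # followed otherwise


Tree = Union[Leaf, Node]


def run(tree, x):
    """Follow the branches selected by input x and return the leaf's verdict."""
    while isinstance(tree, Node):
        tree = tree.left if tree.test(x) else tree.right
    return tree.accept


def depth(tree):
    """Depth of the tree: the largest number of tests on a root-to-leaf path."""
    if isinstance(tree, Leaf):
        return 0
    return 1 + max(depth(tree.left), depth(tree.right))


# Hypothetical example over the reals with addition and equality:
# recognize the set of (x1, x2) with x1 = x2 or x1 + x2 = 0.
tree = Node(lambda x: x[0] == x[1], Leaf(True),
            Node(lambda x: x[0] + x[1] == 0, Leaf(True), Leaf(False)))
print(run(tree, (1, 1)), run(tree, (1, -1)), run(tree, (1, 2)), depth(tree))
```

The depth of this particular tree is only an upper bound on the branching complexity of the set it recognizes.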

The branching tree model is obviously not very realistic since any term, no 
matter how complex, can be evaluated for free. Consequently, this model is per- 
haps more suitable for proving lower bounds than upper bounds (some upper 
bounds are nevertheless presented in section lb. 211 . Lower bounds for approxi- 
mating roots of complex polynomials in the structure (IR, , x,/,<) can be 
found in P2HH. In this context, the terms topological complexity and topological 
decision tree have been used instead of branching complexity and branching tree. 

The computation tree is a more realistic model in which the complexity of 
terms is taken into account. In addition to branching nodes, a computation tree 
has unary computation nodes. A computation node g computes a term of the 
form $f_i(t_1, \ldots, t_{n_i})$ where $f_i$ is a function of M and $t_1, \ldots, t_{n_i}$ are variables, 
parameters from M or terms computed by computation nodes located between 
the current node g and the root of the tree. The branching nodes of a 
computation tree are labeled by atomic formulas of the form $r_i(t_1, \ldots, t_{m_i})$ where $r_i$ is 
a relation of M and again $t_1, \ldots, t_{m_i}$ are variables, parameters from M or terms 
computed by computation nodes located between the current node and the root 
of the tree. 

Note that a computation tree is nothing but a special way of representing a
formula, and therefore the subsets of $M^n$ accepted by computation trees (or by
branching trees, or by circuits as defined in section 2.3) are simply the
definable sets. One may again argue that the computation tree model is mostly
suitable for lower bounds as it is still too powerful. Note for instance that in
the standard structure M = {0, 1} any subset of $M^n$ can be recognized in
depth n. As far as we are concerned, the “fully realistic” model is the circuit
model of section 2.3. Computation trees are nevertheless suitable for upper
bounds if preprocessing is allowed. That is, if we have to recognize very
quickly a subset X of $M^n$ for some fixed value of n, we might first spend a
lot of time (and space) to construct a computation tree recognizing X. The depth
of this tree is a reasonable measure of the number of elementary operations
needed to decide whether an element of $M^n$ belongs to X. Preprocessing is a
quite common technique in computational geometry, see for instance [?].

2.3 Circuits

The input gates of a circuit are labeled by variables or parameters from M . 
There are several types of computation gates in a circuit over M: 

1. For each function $f_i$ of M, gates of type $f_i$ apply this function to
their $n_i$ inputs.

2. For each relation $r_i$ of M, gates of type $r_i$ apply the characteristic
function of $r_i$ to their inputs. For this definition to make sense, we assume
that M contains two distinguished elements called 0 and 1.

3. Finally, selection gates compute a function $s(x, y, z)$ of their three
inputs such that $s(0, y, z) = y$ and $s(1, y, z) = z$. The behaviour of s on an
input $(x, y, z)$ with $x \notin \{0, 1\}$ is not important. We shall assume
that $s(x, y, z)$ is equal in this case to some fixed term $t(x, y, z)$ of M.

A circuit is parameter-free if it uses only the parameters 0 and 1. Strangely
enough, selection gates already appear in [?] in a context where they are not
really needed. Indeed, in any field the term $t(x, y, z) = xz + (1 - x)y$ is a
selection function.

The number of computation gates in a circuit is its size. A circuit C with n
input variables and m output gates computes a function from $M^n$ to $M^m$. We
are mostly interested in the case m = 1. For such a circuit, the set of accepted
inputs is by definition the set of inputs $(x_1, \ldots, x_n) \in M^n$ such that
$C(x_1, \ldots, x_n) = 1$. As mentioned before, circuits with n input variables
accept exactly the definable subsets of $M^n$.

Computation trees are at least as powerful as circuits since a circuit of size s
can be simulated by a computation tree of depth O(s). As we shall see in
section 4, the converse is not so clear.
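
As an illustration of the three gate types, here is a small Python sketch of a circuit evaluator over an ordered field, using exact rationals. The gate encoding, the function names `select` and `eval_circuit`, and the example circuit `MAX` are assumptions made for this sketch and do not come from the paper; the selection gate is implemented with the field term $xz + (1 - x)y$ mentioned above.

```python
from fractions import Fraction

def select(x, y, z):
    # Field term t(x, y, z) = x*z + (1 - x)*y: equals y when x = 0 and z when x = 1.
    return x * z + (1 - x) * y

def eval_circuit(gates, inputs):
    """gates: list of (op, args); args index earlier values (inputs come first)."""
    vals = list(inputs)
    for op, args in gates:
        a = [vals[i] for i in args]
        if op == "+":
            vals.append(a[0] + a[1])
        elif op == "-":
            vals.append(a[0] - a[1])
        elif op == "*":
            vals.append(a[0] * a[1])
        elif op == "<":      # relation gate: characteristic function of <
            vals.append(Fraction(int(a[0] < a[1])))
        elif op == "sel":    # selection gate
            vals.append(select(*a))
        else:
            raise ValueError(op)
    return vals[-1]

# Hypothetical circuit of size 2 computing max(x, y); inputs sit at indices 0 and 1.
MAX = [("<", (0, 1)),       # index 2: 1 if x < y, else 0
       ("sel", (2, 0, 1))]  # index 3: x if index 2 holds 0, y if it holds 1
```

The size of a circuit in this encoding is simply `len(gates)`. Evaluating the gates one after the other is also the idea behind the O(s) simulation by a computation tree, where each relation gate becomes a branching node.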

3 Complexity Classes 

Our main complexity classes, such as P and NP, will be defined in terms of
circuits. Formally, a problem is simply a subset of
$M^\infty = \bigcup_{n \ge 1} M^n$. We first define the class $\mathbb{P}_M$ of
non-uniform polynomial-time problems. A problem $X \subseteq M^\infty$ is in
$\mathbb{P}_M$ if there exist parameters $\alpha_1, \ldots, \alpha_p \in M$, a
polynomial $p(n)$ and a family of parameter-free circuits $(C_n)_{n \ge 1}$
where $C_n$ has $n + p$ inputs, is of size at most $p(n)$ and satisfies the
following condition:

$\forall x \in M^n \quad x \in X \Leftrightarrow C_n(\alpha_1, \ldots, \alpha_p, x_1, \ldots, x_n) = 1$    (1)

$\mathbb{NP}_M$ is the non-deterministic version of $\mathbb{P}_M$. That is, a
problem X is in $\mathbb{NP}_M$ if there exists a polynomial $q(n)$ and a
problem $Y \in \mathbb{P}_M$ such that for any $n \ge 1$ and any $x \in M^n$,

$x \in X \Leftrightarrow \exists y \in M^{q(n)} \;\; \langle x, y \rangle \in Y$.    (2)

Several equivalent choices for the pairing function
$\langle \cdot, \cdot \rangle : M^\infty \times M^\infty \to M^\infty$ are
possible, for instance:

$\langle (x_1, \ldots, x_n), (y_1, \ldots, y_m) \rangle = (0, x_1, 0, x_2, \ldots, 0, x_n, 1, y_1, 1, y_2, \ldots, 1, y_m)$.






One could also choose q so that the map $n \mapsto n + q(n)$ is injective, and
simply concatenate x and y.
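
The first pairing function can be written out in a few lines of Python. This is only a sketch: the names `pair` and `unpair` are introduced here, and tuples of Python numbers stand in for elements of $M^\infty$.

```python
def pair(x, y):
    """<x, y> = (0, x_1, ..., 0, x_n, 1, y_1, ..., 1, y_m)."""
    out = []
    for v in x:
        out += [0, v]
    for v in y:
        out += [1, v]
    return tuple(out)

def unpair(z):
    """Inverse of pair on its image: the tag at each even position tells x from y."""
    x = [z[i + 1] for i in range(0, len(z), 2) if z[i] == 0]
    y = [z[i + 1] for i in range(0, len(z), 2) if z[i] == 1]
    return tuple(x), tuple(y)
```

The tags 0 and 1 make the encoding injective whatever the lengths of x and y are, at the price of doubling the input length.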

A problem X is in the class $\mathrm{P}_M$ of polynomial-time problems if
$X \in \mathbb{P}_M$ and the corresponding circuit family $(C_n)$ in (1) is
uniform in the following sense: there exists a (classical) Turing machine which
on input n constructs $C_n$ in time polynomial in n. This definition makes sense
since a parameter-free circuit is a purely boolean object. To be completely
precise one should specify how circuits are encoded in binary words [?]; there
is no significant difference with the classical case [?]. This complexity class
can also be defined with Turing machines over M instead of circuits [?]. In the
case where M is a ring, these Turing machines are equivalent to Blum-Shub-Smale
machines [?].

The class $\mathrm{NP}_M$ of non-deterministic polynomial time problems is
obtained from $\mathrm{P}_M$ in the same way as $\mathbb{NP}_M$ is obtained from
$\mathbb{P}_M$: just replace the condition $Y \in \mathbb{P}_M$ in (2) by
$Y \in \mathrm{P}_M$.

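Example 2. For the standard structure M = {0, 1} the usual complexity classes
are recovered: $\mathrm{P}_M = \mathrm{P}$, $\mathrm{NP}_M = \mathrm{NP}$,
$\mathbb{P}_M = \mathrm{P/poly}$, $\mathbb{NP}_M = \mathrm{NP/poly}$.

For $M = (\mathbb{R}, +, -, \times, <)$ or $M = (\mathbb{R}, +, -, <)$,
$\mathrm{P}_M = \mathbb{P}_M$ and $\mathrm{NP}_M = \mathbb{NP}_M$ since circuit
families can be encoded in parameters.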

One can also define the class $\mathbb{P}^0_M$ of parameter-free non-uniform
polynomial time problems (set p = 0 in (1)), and from this class we obtain the
classes $\mathbb{NP}^0_M$, $\mathrm{P}^0_M$ and $\mathrm{NP}^0_M$, which are the
parameter-free versions of $\mathbb{NP}_M$, $\mathrm{P}_M$ and $\mathrm{NP}_M$.
These parameter-free classes appear in section 6.

The problems “$\mathbb{P}_M = \mathbb{NP}_M$ ?” and
“$\mathrm{P}_M = \mathrm{NP}_M$ ?” are open for most structures of interest (but
for any structure M, the latter equality implies the former). The main open
problem in this general theory is whether there exists a structure M satisfying
$\mathrm{P}_M = \mathrm{NP}_M$.

There is more than P and NP in the world of structural complexity, and
lots of classical complexity classes can be redefined in our general framework.
For instance, the reader can easily imagine how the k-th levels $\Sigma^k_M$ and
$\Pi^k_M$ of the uniform polynomial hierarchy are defined [?] (replace the
existential quantifiers in (2) by k alternating quantifier blocks). Of course,
there is also a non-uniform polynomial hierarchy.

One of the main differences with the classical theory comes from the role
of space: as shown by Michaux [?], a Turing machine over
$(\mathbb{R}, +, -, \times, <)$ can perform any computation in constant work
space. The classical definition of PSPACE is therefore of little use in our
general framework. We will instead work with the class $\mathrm{PAR}_M$ of
parallel polynomial time problems. It is again easier to first define, as in
[?], a non-uniform version of this complexity class: a problem X is in
$\mathbb{PAR}_M$ if there exist parameters $\alpha_1, \ldots, \alpha_p \in M$, a
polynomial $p(n)$ and a family of parameter-free circuits
$(C_n(x_1, \ldots, x_n, y_1, \ldots, y_p))$ such that
$C_n(x_1, \ldots, x_n, \alpha_1, \ldots, \alpha_p)$ solves X for inputs in
$M^n$, and is of depth at most $p(n)$. We say that X is in $\mathrm{PAR}_M$ if
there exists a Turing machine which, on input n, constructs $C_n$ in work space
polynomially bounded in n.

Example 3. For M = {0, 1}, $\mathrm{PAR}_M = \mathrm{PSPACE}$; for
$M = (\mathbb{R}, +, -, \times, <)$, our definition is equivalent to those of
[?] and [?].

4 Is P Equal to NP?

If one wishes to investigate the “$\mathrm{P}_M = \mathrm{NP}_M$ ?” question in
some structure M, the first thing to do is to find out whether M admits
quantifier elimination. Indeed, if M does not admit quantifier elimination,
$\mathrm{NP}_M$ problems cannot be solved by circuits (or equivalently by
quantifier-free formulas) of any size. One should therefore consult one’s
favourite model theory book (e.g. [?]) for a list of structures that admit or do
not admit quantifier elimination. For instance, the following structures admit
quantifier elimination:

(i) vector spaces over $\mathbb{Q}$, e.g., $\mathbb{R}_{vs} = (\mathbb{R}, +, -, =)$;

(ii) ordered vector spaces over $\mathbb{Q}$, e.g., $\mathbb{R}_{ovs} = (\mathbb{R}, +, -, <)$;

(iii) algebraically closed fields of any characteristic, e.g., $\mathbb{C} = (\mathbb{C}, +, -, \times, =)$;

(iv) real closed fields, e.g., $\mathbb{R} = (\mathbb{R}, +, -, \times, <)$;

(v) differentially closed fields of characteristic zero (of which we can give no 
natural example). 

The reals with exponentiation $(\mathbb{R}, +, -, \times, \exp, <)$ and the
integers $(\mathbb{N}, +, -, \times, <)$ do not admit quantifier elimination.
However, the “$\mathrm{NP}_M = \mathrm{coNP}_M$ ?” question makes sense in the
former structure since it is model-complete (i.e., every existential formula is
equivalent to a universal formula).

For structures (i) through (iv), elimination of quantifiers can be performed in
a relatively efficient manner in the sense that
$\mathrm{NP}_M \subseteq \mathrm{PAR}_M$. Proofs of this fact for
$M = \mathbb{R}_{vs}$ and $M = \mathbb{R}_{ovs}$ can be found in [?] (these two
structures satisfy the stronger property that $\mathrm{NP}_M = \mathrm{BNP}_M$,
i.e., existential quantification over M is polynomially equivalent to
existential quantification over {0, 1}). For algebraically closed fields one may
consult [?] and for real closed fields [?]. The complexity theory of
differentially closed fields is not so well understood. One may cite a triply
exponential quantifier elimination algorithm [?] and a study of the “P = NP ?”
problem for these structures in [?].

Quantifier elimination is even more closely connected to the
“$\mathrm{P}_M = \mathrm{NP}_M$ ?” question than explained at the beginning of
this section. Roughly speaking, $\mathrm{P}_M = \mathrm{NP}_M$ means that
existential quantifiers can be eliminated in polynomial time, if we represent
the eliminating formula by a circuit:

Theorem 1. $\mathrm{P}_M = \mathrm{NP}_M$ if and only if there exist parameters
$\alpha_1, \ldots, \alpha_k$ of M and a polynomial time algorithm (in the
classical sense) which, given a parameter-free existential formula
$\exists y_1 \cdots \exists y_p \, F(x_1, \ldots, x_n, y_1, \ldots, y_p)$,
outputs a circuit C over M such that
$C(x_1, \ldots, x_n, \alpha_1, \ldots, \alpha_k)$ is equivalent to the
existential formula.

The “$\mathrm{P}_M = \mathrm{NP}_M$ ?” question can therefore be viewed as a
purely classical question, that is, a question about algorithms over the
structure {0, 1}. This theorem remains true if we replace the classical
elimination algorithm by an algorithm over M, which is the way it is stated in
[?].

We have precious few examples of structures that admit quantifier elimination
but where $\mathrm{P}_M$ is provably different from $\mathrm{NP}_M$. The main
example is due to Klaus Meer [?]:

Theorem 2. $\mathrm{P}_{\mathbb{R}_{vs}} \neq \mathrm{NP}_{\mathbb{R}_{vs}}$.

Another example is the structure of arborescent dictionaries constructed in [?]
(see also [?]). Meer’s proof of Theorem 2 is based on a multidimensional
version of the knapsack problem. The proof given below is based on Shub and
Smale’s [?] Twenty Questions (TQ) problem. An input $(x_1, \ldots, x_n)$ is in
TQ if $x_1$ is an integer between 0 and $2^n - 1$ ($x_1$ is therefore the only
truly “numerical” input; the only role of $x_2, \ldots, x_n$ is to specify the
value of n).

Twenty Questions is in $\mathrm{NP}_{\mathbb{R}_{vs}}$ since
$(x_1, \ldots, x_n) \in \mathrm{TQ}$ if and only if

$\exists u_1, \ldots, u_n \in \{0, 1\} \quad x_1 = \sum_{i=1}^{n} 2^{i-1} u_i$.

However, TQ is not only outside $\mathrm{P}_{\mathbb{R}_{vs}}$ but its branching
complexity is exactly equal to $2^n$. Indeed, this problem can obviously be
solved in depth $2^n$ (just perform the $2^n$ tests “$x_1 = 0$ ?”, ...,
“$x_1 = 2^n - 1$ ?” sequentially). For the converse, let $T_n$ be a branching
tree of depth d which solves Twenty Questions for inputs of the form
$x_2 = 0, \ldots, x_n = 0$. This tree has a single real-valued input $x_1$ and
recognizes a finite subset of $\mathbb{R}$, of cardinality $2^n$. We claim that
$d \ge 2^n$. Each internal node of $T_n$ performs a test of the form
“$l(x_1) = 0$ ?” where $l$ is an affine function; we may assume without loss of
generality that $l$ is not identically 0, so that each test is satisfied by at
most one value of $x_1$. The generic path of $T_n$ is obtained by answering no
to all these tests. Observe that inputs which follow the generic path are
rejected since $T_n$ recognizes a finite subset of $\mathbb{R}$. Accepted inputs
must therefore satisfy one of the tests performed along the generic path (also
called the canonical path), which proves the claim, and Theorem 2.
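
The two facts used above (membership in NP through a boolean witness, and the naive decision procedure with $2^n$ equality tests) can be sketched in Python as follows. The function names are hypothetical, and the witness search is done by brute force purely for illustration; an NP verifier would only check one given witness.

```python
from fractions import Fraction
from itertools import product

def in_tq_by_witness(x1, n):
    """Look for bits u_1..u_n with x1 = sum of 2^(i-1) * u_i (brute-force search)."""
    return any(x1 == sum(u << i for i, u in enumerate(bits))
               for bits in product((0, 1), repeat=n))

def in_tq_sequential(x1, n):
    """Equality tests only: 'x1 = 0 ?', ..., 'x1 = 2^n - 1 ?'.

    Returns (answer, number of tests performed)."""
    for c in range(2 ** n):
        if x1 == c:
            return True, c + 1
    return False, 2 ** n
```

A rejected input such as 1/2 follows the generic path and goes through all $2^n$ tests, which is exactly the situation exploited in the lower bound argument.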

One might be led by the simplicity of this proof to the belief that other 
results of this type should not be too difficult to obtain. For instance, one could 
try to: 

(i) separate higher levels of the polynomial hierarchy over $\mathbb{R}_{vs}$;

(ii) obtain separation results for richer structures than $\mathbb{R}_{vs}$.
For instance, is $\mathrm{P}_{\mathbb{R}_{ovs}}$ different from
$\mathrm{NP}_{\mathbb{R}_{ovs}}$ ?

It turns out that both questions are quite difficult, however. In the first
direction, it is actually possible to separate the first few levels of the
polynomial hierarchy [?]: a variation on the proof of Theorem 2 shows that
Twenty Questions is not in $\mathrm{coNP}_{\mathbb{R}_{vs}}$, which separates
$\mathrm{NP}_{\mathbb{R}_{vs}}$ from $\mathrm{coNP}_{\mathbb{R}_{vs}}$; and a
similar problem
can be used to separate $\Sigma^2_{\mathbb{R}_{vs}} \cap \Pi^2_{\mathbb{R}_{vs}}$
from $\Sigma^1_{\mathbb{R}_{vs}} \cup \Pi^1_{\mathbb{R}_{vs}}$. Separating higher
levels is essentially “impossible”, as shown by a transfer theorem of [?]:

Theorem 3. For all k > 0: 

PH = ^ phr^, = 

We showed in the same paper that parallel polynomial time cannot be separated 
from the polynomial hierarchy: 



Theorem 4. P = PSPACE ^ PARr„^ = n 7^ 
and that P cannot be separated from NP $\cap$ coNP:

Theorem 5. $\mathrm{P} = \mathrm{NP} \Rightarrow \mathrm{P}_{\mathbb{R}_{vs}} = \mathrm{NP}_{\mathbb{R}_{vs}} \cap \mathrm{coNP}_{\mathbb{R}_{vs}}$.

Bourgade [?] has extended two of the preceding theorems to infinite Abelian
groups of prime exponent.

Twenty Questions cannot be used to separate P from NP in $\mathbb{R}_{ovs}$
since this problem is in $\mathrm{P}_{\mathbb{R}_{ovs}}$, by binary search. In
fact, obtaining a proof of this separation is all but hopeless by another
transfer theorem from [?]:

Theorem 6. $\mathrm{P} = \mathrm{NP} \Rightarrow \mathrm{P}_{\mathbb{R}_{ovs}} = \mathrm{NP}_{\mathbb{R}_{ovs}}$.
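
The binary search mentioned before Theorem 6 can be sketched as follows. This is a minimal Python illustration with a hypothetical function name; it only uses order comparisons between $x_1$ and integer constants, and it ignores the cost of building those constants (which a circuit over $(\mathbb{R}, +, -, <)$ would obtain by repeated additions).

```python
from fractions import Fraction

def in_tq_binary_search(x1, n):
    """Decide whether x1 is an integer in [0, 2^n - 1] with about n order tests.

    Returns (answer, number of tests performed)."""
    lo, hi = 0, 2 ** n - 1
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if x1 <= mid:       # order test "x1 <= mid ?", i.e. not (mid < x1)
            hi = mid
        else:
            lo = mid + 1
    tests += 1              # final test: is x1 equal to the only remaining candidate?
    return x1 == lo, tests
```

With the order relation available, n + 1 tests are enough, in contrast with the $2^n$ equality tests needed over $\mathbb{R}_{vs}$.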

In a previous paper [?] we obtained a similar result for the “P = PAR ?”
problem:

Theorem 7. $\mathrm{P} = \mathrm{PSPACE} \Rightarrow \mathrm{P}_{\mathbb{R}_{ovs}} = \mathrm{PAR}_{\mathbb{R}_{ovs}}$.

Theorems 3 through 7 show that a number of separations between real
complexity classes are (at least) as hard to prove as outstanding conjectures
from discrete complexity theory. Yet we know that these separations must hold
true since an equality of two real complexity classes would imply the equality
of the corresponding discrete complexity classes. This follows from the “boolean
parts” results of [?]. For instance, $\Sigma^2_{\mathbb{R}_{vs}} = \Pi^2_{\mathbb{R}_{vs}}$ would imply $\Sigma^2 = \Pi^2$, and
$\mathrm{P}_{\mathbb{R}_{ovs}} = \mathrm{NP}_{\mathbb{R}_{ovs}}$ would imply
P/poly = NP/poly. Non-uniformity comes into play in the latter statement only
because arbitrary real parameters are allowed: P/poly can be replaced by P and
NP/poly by NP if we work with parameter-free machines, or in the structure
$\mathbb{Q}_{ovs}$ instead of $\mathbb{R}_{ovs}$.

We will see in the next section that Theorems 6 and 7 hinge on the following
question:

Question 1. Are computation trees more powerful than circuits ? 

If this deliberately fuzzy question is interpreted in its broadest sense, it is
possible to give a positive answer for many structures of interest. For
instance, as pointed out before, in the standard structure M = {0, 1} any
boolean function of n variables can be computed by a tree of depth n, but a
simple counting argument shows that most boolean functions have exponential
circuit complexity [?]. Bounds on the number of consistent sign conditions (à la
Thom-Milnor, see end of section 5.1) show that most boolean functions also have
exponential circuit complexity over $\mathbb{R}$ and $\mathbb{C}$, even if
arbitrary real or complex parameters are allowed [?]. It is nevertheless
possible to construct a special-purpose structure in which polynomial size
computation trees are equivalent to polynomial size circuits (hint: try to
encode a computation tree in a circuit’s parameter; this can be done efficiently
with Poizat’s arborescent dictionaries).

This question becomes more interesting if we compare the power of trees
and circuits on a restricted class of problems, for instance on $\mathrm{NP}_M$
problems. In this case the situation is dramatically different since, as pointed
out by Poizat (personal communication), we do not have a single example of a
structure M where $\mathrm{NP}_M$ problems can be solved by polynomial depth
computation trees, but $\mathrm{P}_M \neq \mathrm{NP}_M$. For instance,
computation trees over $\mathbb{R}_{ovs}$ are likely to be more powerful than
circuits on $\mathrm{NP}_{\mathbb{R}_{ovs}}$ problems (in fact, as explained in
the next section, any such problem can be solved by polynomial depth computation
trees) but by Theorem 6 proving this is essentially “impossible.”

Shub and Smale’s invention of Twenty Questions was motivated by an attempt to
separate $\mathrm{P}_{\mathbb{C}}$ from $\mathrm{NP}_{\mathbb{C}}$. The
plausible number-theoretic conjecture that “$k!$ is ultimately hard to compute”
was put forward in [?] and was shown to imply that Twenty Questions is not in
$\mathrm{P}_{\mathbb{C}}$. Since Twenty Questions is in
$\mathrm{NP}_{\mathbb{C}}$, a proof of this conjecture would indeed separate
$\mathrm{P}_{\mathbb{C}}$ from $\mathrm{NP}_{\mathbb{C}}$. We point out however
that Shub and Smale’s proof of this implication is based on a canonical path
argument similar to the argument of Theorem 2 (but more involved). As a result,
the conjecture in fact implies that Twenty Questions cannot be solved by
polynomial depth computation trees. Although of great interest, its proof would
therefore shed no light on Question 1.

5 Complexity and Point Location 

5.1 Point Location by Computation Trees 

There are two main steps in the proofs of Theorems 6 and 7.

(i) Show that any $\mathrm{NP}_{\mathbb{R}_{ovs}}$ (or
$\mathrm{PAR}_{\mathbb{R}_{ovs}}$) problem can be solved by polynomial
depth computation trees.

(ii) Show that under an appropriate complexity-theoretic assumption (P = NP
or P = PSPACE), these trees can be “transformed” into polynomial size
circuits.

Two proofs of (i) are known. The first one is essentially due to Meyer auf der
Heide [?]. In fact this author showed that $\mathrm{PAR}_{\mathbb{R}_{ovs}}$
problems can be solved by $\mathbb{R}_{ovs}$-branching trees of polynomial
depth. In order to obtain polynomial depth computation trees, we just have to
check that the coefficients of affine functions labeling the nodes of these
branching trees are integers of polynomial size [?]. Meyer auf der Heide’s proof
exploits the geometrical structure of a $\mathrm{PAR}_{\mathbb{R}_{ovs}}$
problem: its restriction to inputs in $\mathbb{R}^n$ is a union of faces of an
arrangement of $2^{n^{O(1)}}$ hyperplanes. We recall that any finite set of m
hyperplanes of equations $h_1(x) = 0, \ldots, h_m(x) = 0$ partitions
$\mathbb{R}^n$ in a finite number of faces, where each face is the set of points
satisfying a system of equations of the form

$\mathrm{sign}(h_1(x)) = \epsilon_1, \ldots, \mathrm{sign}(h_m(x)) = \epsilon_m$    (3)

for some fixed sign vector $(\epsilon_1, \ldots, \epsilon_m) \in \{-1, 0, 1\}^m$.
In order to decide whether an input point should be accepted we just need to
locate it in the arrangement, that is, to determine to which face it belongs.
Meyer auf der Heide [?] asked whether any union of m hyperplanes of
$\mathbb{R}^n$ can be recognized by $\mathbb{R}_{ovs}$-branching trees of depth
$(n \log m)^{O(1)}$. His construction does not quite yield that result because
it uses certain bounds on the size of the hyperplanes’ equations. In [?] we used
a construction of Meiser [?] to give a second proof of (i); this construction
also yields a positive answer to Meyer auf der Heide’s question. In fact
Meiser’s construction almost answers that question, except that he has a
non-degeneracy assumption on the arrangement, and that he allows multiplications
in his computation model (that is, he works with $\mathbb{R}$-branching trees
instead of $\mathbb{R}_{ovs}$-branching trees). These problems can be fixed as
explained in [?]. To complete the second proof of (i) we also need to analyze
the size of coefficients in the resulting $\mathbb{R}_{ovs}$-branching trees;
this is done in the same paper.
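
To make the notion of face concrete, the following Python sketch computes the sign vector of equation (3) for a point and a list of hyperplanes. The encoding of a hyperplane as a pair `(a, b)` and the example arrangement `H` are assumptions of this sketch. Note that it evaluates all m hyperplanes, so it corresponds to a tree of depth proportional to m, not to the $(n \log m)^{O(1)}$ depth achieved by the constructions discussed above.

```python
from fractions import Fraction

def sign(v):
    """Sign of v as an element of {-1, 0, 1}."""
    return (v > 0) - (v < 0)

def face_of(hyperplanes, x):
    """hyperplanes: list of (a, b) encoding h(x) = a.x + b; returns the sign vector."""
    return tuple(sign(sum(ai * xi for ai, xi in zip(a, x)) + b)
                 for a, b in hyperplanes)

# Hypothetical arrangement in R^2: the lines x = 0, y = 0 and x + y - 1 = 0.
H = [((1, 0), 0), ((0, 1), 0), ((1, 1), -1)]
```

Two points lie in the same face exactly when `face_of` returns the same tuple for both.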

The transformation of polynomial depth trees into polynomial size circuits
under the assumption P = PSPACE is based on an exhaustive search procedure
among all polynomial depth trees. This procedure can be made to run in
polynomial space, and thus in polynomial time under the assumption P = PSPACE
(see [?] or section 6). In order to use the weaker assumption P = NP, the mere
knowledge that $\mathrm{PAR}_{\mathbb{R}_{ovs}}$ problems can be solved by
polynomial depth trees is not sufficient: we also need to know how these trees
are constructed. We chose to work with Meyer auf der Heide’s construction
because it yields a stronger result than Meiser’s. Namely, we obtained the
unconditional result that $\mathrm{NP}_{\mathbb{R}_{ovs}}$ problems can be
solved by $\mathrm{P}_{\mathbb{R}_{ovs}}$ algorithms with the help of a boolean
NP oracle [?]; with Meiser’s construction one would obtain an oracle in a higher
level of the polynomial hierarchy.

Nothing like Theorem 6 or Theorem 7 is known if we replace $\mathbb{R}_{ovs}$ by
$\mathbb{R}$ or $\mathbb{C}$. It is not even known whether
$\mathrm{NP}_{\mathbb{R}}$ or $\mathrm{NP}_{\mathbb{C}}$ problems can be solved
by polynomial depth computation trees (as shown in section 6, this question is
very much related to the “P = NP ?” problem over the real and complex numbers).
In fact, it is natural to conjecture that Twenty Questions cannot be solved by
polynomial depth computation trees (as pointed out in section 4, this would
follow from the conjecture that $k!$ is ultimately hard to compute). The bold
conjecture that all $\mathrm{PAR}_{\mathbb{R}}$ problems can be solved by
polynomial depth computation trees was put forward in [?]. As in the
$\mathbb{R}_{ovs}$ case, there is an intimate relationship between
$\mathrm{PAR}_{\mathbb{R}}$ problems and point location: the restriction to
$\mathbb{R}^n$ of such a problem is a union of faces of an arrangement defined
by $m = 2^{n^{O(1)}}$ polynomials $f_1, \ldots, f_m$ of degree $2^{n^{O(1)}}$.
Faces are defined as in the $\mathbb{R}_{ovs}$ case: just replace
$h_1, \ldots, h_m$ in (3) by $f_1, \ldots, f_m$. A similar property is also true
of $\mathrm{PAR}_{\mathbb{C}}$ problems. Here we need to redefine the sign
function so that it only takes the values 0 (if its input is equal to zero) or 1
(if it is nonzero). As explained in [?], point location in an arrangement
defined by m real polynomials of fixed degree in n variables can be performed by
a computation tree of depth polynomial in $n \log m$ by reduction to the linear
case.

At least we know that $\mathrm{PAR}_{\mathbb{R}}$ and $\mathrm{PAR}_{\mathbb{C}}$
problems have polynomial branching complexity. In the real case, this follows
from the well-known result that m polynomials of degree d in n variables define
an arrangement with $(md)^{O(n)}$ faces (see [2] for a sharper bound) and from a
recent result of Grigoriev [20] who, answering a question of [?], showed that
point location in an arrangement with N faces has branching complexity
$O(\log N)$. The $(md)^{O(n)}$ bound on the number of faces still holds in the
complex case, and point location in an arrangement with N faces has branching
complexity $O(n \log N)$ as explained in section 5.2. This shows that
$\mathrm{PAR}_{\mathbb{C}}$ problems indeed have polynomial branching complexity.

5.2 Branching Complexity of Point Location 

Recall that a tree solves the point location problem for a given arrangement if
two input points arriving at the same leaf of the tree always belong to the same
cell. In this section we prove the $O(n \log N)$ upper bound on the complexity
of point location in the complex case, and give some hints for the proof of
Grigoriev’s $O(\log N)$ bound. In fact, his proof applies not only to
$\mathbb{R}$ but to any ordered field. Likewise, we state and prove our
$O(n \log N)$ bound not just for the field of complex numbers, but for an
arbitrary field.

We need the following fact for the proof of Theorem 8 (see [3] for a
constructive version).

Proposition 1. Let K be an infinite field and V a variety of $K^n$ defined by
polynomials $f_1, \ldots, f_s \in K[X_1, \ldots, X_n]$ of degree at most d. This
variety can be defined by n + 1 polynomials of $K[X_1, \ldots, X_n]$ of degree
at most d.

Proof. We shall see that V can be defined by n + 1 “generic” linear combinations
of the input equations. More precisely, for a matrix
$\alpha = (\alpha_{ij})_{1 \le i \le n+1,\, 1 \le j \le s}$ of elements of K,
let us denote by $V_\alpha$ the variety of $K^n$ defined by the polynomials
$g_i = \sum_{j=1}^{s} \alpha_{ij} f_j$ ($1 \le i \le n + 1$). Obviously,
$V \subseteq V_\alpha$ for any $\alpha$. It turns out that the converse
inclusion holds for “most” $\alpha$’s.

Let p be the characteristic of K and $F_p$ the prime field of characteristic p
(i.e., $F_0 = \mathbb{Q}$ and $F_p = \mathbb{Z}/p\mathbb{Z}$ for $p \ge 2$). We
shall assume for now that K is algebraically closed, of infinite transcendence
degree over $F_p$.

The coefficients of the $f_j$’s lie in a subfield $k \subseteq K$ of finite
transcendence degree. We claim that $V = V_\alpha$ if the entries of $\alpha$
are algebraically independent over k. Assume indeed that
$\sum_{j=1}^{s} \alpha_{ij} f_j(x) = 0$ for $i = 1, \ldots, n + 1$. Since the
tuple $\alpha$ is of transcendence degree at least $s(n + 1) - n$ over $k(x)$,
any transcendence base of $k(x, \alpha)$ over $k(x)$ which is made up of entries
of $\alpha$ must contain at least one row of $\alpha$ (at most n entries are
missing, and there are n + 1 rows). If i is such a row, the equality
$\sum_{j=1}^{s} \alpha_{ij} f_j(x) = 0$ implies that $f_j(x) = 0$ for all j
since $f_j(x) \in k(x)$ and $\alpha_{i1}, \ldots, \alpha_{is}$ are algebraically
independent over that field by choice of i. We therefore conclude that
$x \in V$, and that $V_\alpha \subseteq V$ since x was an arbitrary point of
$V_\alpha$.

To complete the proof of the proposition, we just need to remove the assumption
on K. In the general case, K can be embedded in an algebraically closed field
$\overline{K}$ of infinite transcendence degree. The $f_j$’s define a variety
$\overline{V}$ of $\overline{K}^n$, and likewise for any matrix $\alpha$ with
entries in $\overline{K}$, the $g_i$’s define a variety $\overline{V}_\alpha$ of
$\overline{K}^n$ which contains $\overline{V}$. Let G be the (constructible) set
of all $\alpha \in \overline{K}^{s(n+1)}$ such that
$\overline{V} = \overline{V}_\alpha$. We have seen that $\alpha \in G$ if its
entries are algebraically independent over k. This implies that G is dense in
$\overline{K}^{s(n+1)}$. It follows that for any infinite subset E of
$\overline{K}$, and in particular for E = K, there exists a matrix
$\alpha \in G$ with entries from E (this can be proved by induction). For such a
matrix, $V = V_\alpha$ since $\overline{V} = \overline{V}_\alpha$.

Theorem 8. Let $\mathcal{F} = \{f_1, \ldots, f_s\}$ be a family of polynomials
of $K[X_1, \ldots, X_n]$, where K is an arbitrary field. The point location
problem for $\mathcal{F}$ has branching complexity $O(n \log N)$, where N is the
number of cells in the arrangement.

Proof. We will assume that K is infinite since the result is trivial for finite
fields (in that case, any function on $K^n$ can be computed in depth $O(n)$).

For any subset I of $[s] = \{1, \ldots, s\}$, we denote by $C_I$ the sign
condition

$(f_i = 0)_{i \in I} \wedge (f_i \neq 0)_{i \notin I}$.

We say that $C_I$ is feasible if the set of points satisfying this sign
condition is nonempty (in which case it is a cell of the arrangement). We shall
also denote by $\le$ a (fixed) total order on the subsets of $[s]$ which is
compatible with inclusion (i.e., $I \subseteq J$ implies $I \le J$). Our
construction is based on these two observations:

1. For any $I \subseteq [s]$, $V_I = \bigcup_{J \ge I} C_J$ is a variety of
$K^n$: it is the union for $J \ge I$ of the varieties $(f_i = 0)_{i \in J}$.

2. Any variety of $K^n$ can be defined by n + 1 equations, as shown in
Proposition 1.

Note that $I \le J$ if and only if $V_J \subseteq V_I$. Let us say that
$I \subseteq [s]$ is feasible if $C_I$ is a feasible sign condition. On input
$x \in K^n$, our point location algorithm finds by binary search the largest
feasible I such that $x \in V_I$. Since x satisfies $C_I$, this is the desired
sign condition.

There are N satisfiable sign conditions, so that we only need $\log N$ steps of
binary search. At each step the algorithm performs a test of the form:
“$x \in V_I$ ?”. By the second observation, this test can be performed in depth
n + 1. The overall depth of the corresponding tree is therefore $O(n \log N)$.
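
The binary search in this proof can be sketched in Python. Everything below is an illustration under stated assumptions: the list of feasible sign conditions is supplied by hand, polynomials are plain Python callables evaluated on exact rationals, the total order is "by cardinality, then lexicographic" (one possible order compatible with inclusion), and the test “$x \in V_I$ ?” is evaluated directly as a union of varieties instead of through the n + 1 equations of Proposition 1.

```python
from fractions import Fraction

def make_locator(polys, feasible):
    """polys: list of callables f_i(x); feasible: index sets I with C_I nonempty."""
    # Total order compatible with inclusion: by cardinality, then lexicographically.
    order = sorted((tuple(sorted(I)) for I in feasible), key=lambda I: (len(I), I))

    def in_V(j, x):
        # Test "x in V_I ?" for I = order[j]: V_I is the union over J >= I
        # of the varieties (f_i = 0 for all i in J).
        return any(all(polys[i](x) == 0 for i in J) for J in order[j:])

    def locate(x):
        lo, hi, tests = 0, len(order) - 1, 0
        while lo < hi:                      # about log2(N) membership tests
            mid = (lo + hi + 1) // 2
            tests += 1
            if in_V(mid, x):
                lo = mid
            else:
                hi = mid - 1
        return set(order[lo]), tests

    return locate

# Hypothetical arrangement in Q^2 (0-based indices): f_0 = x, f_1 = y, f_2 = x - y.
polys = [lambda p: p[0], lambda p: p[1], lambda p: p[0] - p[1]]
feasible = [(), (0,), (1,), (2,), (0, 1, 2)]   # supplied by hand for this example
locate = make_locator(polys, feasible)
```

For instance, `locate((Fraction(3), Fraction(3)))` returns the index set `{2}` (only $x - y$ vanishes) after three membership tests, in line with the $\log N$ bound for N = 5 cells.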

Note that $\log N$ is an obvious lower bound on the branching complexity of
point location (a binary tree with at least N leaves must have depth at least
$\log N$), so that the $O(n \log N)$ of Theorem 8 is not too far from the
optimum.

If K is the field of real numbers (or more generally is an ordered field), the
$\log N$ lower bound is in fact tight. Indeed, we no longer need n + 1
polynomials to define a variety as in Proposition 1: by the usual “sum of
squares” trick, a single polynomial suffices (see [20] for a more direct proof).
The construction of Theorem 8 therefore yields a branching tree of depth
$O(\log N)$. As shown in [20], it is possible within the same depth not only to
find out whether each polynomial $f_i$ is zero or non-zero at the input point,
but to find out whether it is positive or negative:

Theorem 9. Let $\mathcal{F} = \{f_1, \ldots, f_s\}$ be a family of polynomials
of $R[X_1, \ldots, X_n]$ where R is an ordered field. The point location problem
for $\mathcal{F}$ has branching complexity $O(\log N)$, where N is the number of
cells of the arrangement.
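
For the reader's convenience, the “sum of squares” trick used before Theorem 9 can be written out as follows. This is a standard fact about ordered fields; the symbols $g_J$ and $P_I$ are names introduced here and do not appear in the original text.

```latex
% In an ordered field squares are nonnegative, so a sum of squares vanishes
% exactly when every summand does. For J \subseteq [s] put
\[
  g_J = \sum_{i \in J} f_i^2 ,
  \qquad
  g_J(x) = 0 \iff f_i(x) = 0 \ \text{for all } i \in J .
\]
% A finite union of zero sets is the zero set of the product, hence
\[
  V_I = \bigcup_{J \ge I} \{ x : g_J(x) = 0 \}
      = \{ x : P_I(x) = 0 \},
  \qquad
  P_I = \prod_{J \ge I} g_J .
\]
```

Each test “$x \in V_I$ ?” then becomes a single atomic test “$P_I(x) = 0$ ?”, which is why the factor n disappears from the depth bound in the ordered case.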

The proof of Theorem 9 relies on a two-stage construction. In the first stage
we determine for each $f_i$ whether it vanishes at the input point x. This can
be done in depth $O(\log N)$ as explained before Theorem 9. In the second stage,
we determine the sign of $f_i(x)$ for each polynomial $f_i$ which does not
vanish at x. The proof that this can also be done in depth $O(\log N)$ relies on
a nice combinatorial lemma:

Lemma 1. Let $u_1, \ldots, u_m$ be pairwise distinct vectors of
$(\mathbb{Z}/2\mathbb{Z})^k$. If $m > 6$ there exists a vector
$v \in (\mathbb{Z}/2\mathbb{Z})^k$ such that

$m/3 < |\{1 \le i \le m ;\ \langle v, u_i \rangle = 0\}| < 2m/3$.

For the proof of this lemma and its application to Theorem 9 we refer the reader
to [20].

6 The Computation Tree Alternative 

In this section we show that Question 1 plays a crucial role in the study of the
“P = NP ?” problem over the real and complex numbers. The proofs are based
on techniques from [?].

The $\mathrm{NP}_{\mathbb{C}}$-complete problem $\mathrm{HN}_{\mathbb{C}}$
appears in the statement of our computation tree alternative for the complex
numbers. This is the problem of deciding whether a system of polynomial
equations $f_1(x_1, \ldots, x_n) = 0, \ldots, f_s(x_1, \ldots, x_n) = 0$ in n
complex variables has a solution.

Theorem 10. If $\mathrm{HN}_{\mathbb{C}}$ can be solved by a family of
parameter-free computation trees of polynomial depth we have the following
transfer theorem: P = PSPACE implies
$\mathrm{P}_{\mathbb{C}} = \mathrm{NP}_{\mathbb{C}}$. Otherwise,
$\mathrm{P}_{\mathbb{C}} \neq \mathrm{NP}_{\mathbb{C}}$.

Proof. If $\mathrm{HN}_{\mathbb{C}}$ cannot be solved by a family of
parameter-free computation trees of polynomial depth then this problem cannot be
solved by a family of parameter-free circuits of polynomial size. By elimination
of parameters [?], this implies that
$\mathrm{P}_{\mathbb{C}} \neq \mathrm{NP}_{\mathbb{C}}$.

It is well known that the field of complex numbers satisfies the hypothesis
of Theorem 12 below [?]. If P = PSPACE and $\mathrm{HN}_{\mathbb{C}}$ can be
solved by a family of parameter-free computation trees of polynomial depth,
$\mathrm{HN}_{\mathbb{C}}$ is therefore in $\mathrm{P}_{\mathbb{C}}$ (even in
$\mathrm{P}^0_{\mathbb{C}}$) by Theorem 12, and thus
$\mathrm{P}_{\mathbb{C}} = \mathrm{NP}_{\mathbb{C}}$.

Theorem 10 can be interpreted as follows. If we can prove (unconditionally)
that $\mathrm{HN}_{\mathbb{C}}$ cannot be solved by a family of parameter-free
polynomial depth computation trees, we have obtained an unconditional separation
of $\mathrm{P}_{\mathbb{C}}$ from $\mathrm{NP}_{\mathbb{C}}$. If
$\mathrm{HN}_{\mathbb{C}}$ can be solved by a family of parameter-free
polynomial depth computation trees, $\mathrm{P}_{\mathbb{C}}$ is still very
likely to be different from $\mathrm{NP}_{\mathbb{C}}$ (otherwise, NP would be
included in BPP [?]). However, obtaining a proof of this separation would be a
hopeless problem, at least in the current state of discrete complexity theory.

For the reals we can replace $\mathrm{HN}_{\mathbb{C}}$ by the
$\mathrm{NP}_{\mathbb{R}}$-complete problem $\mathrm{4FEAS}_{\mathbb{R}}$:
this is the problem of deciding whether a polynomial of degree at most 4 in n
real variables has a root. We can only deal with parameter-free algorithms since
elimination of parameters is not known to hold in the real case (see [?] for
some results in this direction and more references).

Theorem 11. If $\mathrm{4FEAS}_{\mathbb{R}}$ can be solved by a family of
parameter-free computation trees of polynomial depth we have the following
transfer theorem: P = PSPACE implies
$\mathrm{P}^0_{\mathbb{R}} = \mathrm{NP}^0_{\mathbb{R}}$ (which implies
$\mathrm{P}_{\mathbb{R}} = \mathrm{NP}_{\mathbb{R}}$). Otherwise,
$\mathrm{P}^0_{\mathbb{R}} \neq \mathrm{NP}^0_{\mathbb{R}}$.

Proof. It is similar to the proof of Theorem 10 (the decision algorithm of [?]
can be replaced by the algorithm of [?] or [?]).

In the proof of Theorem 12 it is convenient to work with branching trees of a
restricted form instead of computation trees. That is, we will use the
observation that a computation tree of depth d can be simulated by a branching
tree of depth at most d whose nodes are labeled by atomic formulas
$r_i(t_1, \ldots, t_{m_i})$ satisfying the following condition:
$t_1, \ldots, t_{m_i}$ can be computed by straight-line programs of size at
most d. We recall that a straight-line program over M is a circuit in which all
gates are labeled by functions of M (relation and selection gates are not used).

Theorem 12. Let M be a structure whose parameter-free formulas can be
decided in polynomial space. Let X be a $\mathrm{NP}^0_M$ problem which can be
solved by a family of parameter-free computation trees of polynomial depth. If
P = PSPACE this problem is in $\mathrm{P}^0_M$.

For the proof of this theorem we need to associate to X a boolean problem $\tilde{X}$. 
An instance of $\tilde{X}$ is described by three integers n, L, d (written in unary), and by 
a conjunction $F(x_1, \ldots, x_n)$ of (parameter-free) atomic formulas. For the terms 
in F we assume a straight-line representation. This conjunction defines a subset 
$S_F$ of $\mathbb{R}^n$. An instance is positive if there exists a branching tree T satisfying 
the following properties: 

(i) T is of depth at most d and solves X for inputs in $S_F$. 

(ii) The nodes of T are labeled by atomic formulas of the form $r_i(t_1, \ldots, t_{m_i})$ 
where the terms $t_1, \ldots, t_{m_i}$ are computed by straight-line programs of 
length at most L. 

We need an algorithm to solve $\tilde{X}$, and for positive instances of this problem we 
also need to compute the label $l_r$ of the root of a corresponding tree T (this tree 
might not be unique, but any solution will do). Thus $l_r$ is just a boolean value 
if T is reduced to a leaf, and an atomic formula otherwise. 

Lemma 2. If $X \in \mathrm{NP}^0_M$ then $\tilde{X} \in$ PSPACE. Moreover, for a positive instance 
$l_r$ can be constructed in polynomial space. 

Proof. We first determine whether T can be of depth 0, i.e., reduced to a leaf. In 
that case, T recognizes either $\mathbb{R}^n$ or $\emptyset$ depending on the label of that leaf. Label 
0 is suitable if the formula 

$$\exists x \in \mathbb{R}^n \;\; F(x) \wedge (x \in X)$$

is false. By hypothesis on M, this formula can be decided in polynomial 
space (note that we can introduce additional existentially quantified variables 
in order to move from the straight-line representation of F to the standard 
representation). Label 1 is suitable if the formula 

$$\exists x \in \mathbb{R}^n \;\; F(x) \wedge (x \notin X)$$

is false. If there is a solution in depth 0, we accept the instance of $\tilde{X}$ and output 
the corresponding label. Otherwise, for d > 0 we look for solutions of depth 
between 1 and d (for d = 0 we exit and reject the instance). To do this we 
enumerate (e.g. in lexicographic order) all atomic formulas $A(x_1, \ldots, x_n)$ where 
the terms in A are given by straight-line programs of length at most L. For each 
such formula we do the following. 

1. Decide by a recursive call whether $(n, L, d-1, F(x) \wedge A(x))$ is a positive 
instance of $\tilde{X}$. 

2. Decide by a recursive call whether $(n, L, d-1, F(x) \wedge \neg A(x))$ is a positive 
instance of $\tilde{X}$. 

3. In case of a positive answer to both questions, exit the loop, accept (n, L, d, F) 
and output $l_r = A$. 

The instance is rejected if it is not accepted in the course of this enumeration 
procedure. 

In addition to the space needed to solve the depth 0 case, we just need 
to maintain a stack to keep track of recursive calls. Hence this algorithm runs 
in polynomial space, showing that $\tilde{X} \in$ PSPACE. For positive instances, the 
algorithm also outputs $l_r$ as needed. 

Proof (of Theorem 12). For inputs of size n, X can be solved by a computation 
tree of depth bounded by $a n^b$, where a and b are constants. The idea is to use 
Lemma 2 to move down that tree. The hypothesis P = PSPACE implies that 
$\tilde{X} \in$ P. Moreover, for positive instances $l_r$ can be constructed in polynomial time 
(one should argue that each bit of $l_r$ is in PSPACE, and therefore in P). Thus we 
set $L = d = a n^b$ and F = True. By hypothesis (n, L, d, F) is a positive instance 
of $\tilde{X}$ and therefore $l_r$ can be computed in polynomial time. If $l_r$ is a boolean 
value we stop and output that value. Otherwise $l_r$ is an atomic formula A, and 
we can determine in polynomial time whether the input $x \in \mathbb{R}^n$ to X satisfies 
A (because straight-line programs can be evaluated efficiently). If so, we set 
$F' = F \wedge A(x)$. Otherwise, we set $F' = F \wedge \neg A(x)$. In any case, we set $d' = d - 1$, 
and feed $(n, L, d', F')$ to the algorithm deciding $\tilde{X}$. This process continues until 
a leaf is reached. This requires at most $a n^b$ steps. 
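
The recursive search in the proof of Lemma 2 and the descent used in the proof of Theorem 12 can be illustrated on a finite toy model. The sketch below is only an analogy under strong assumptions: the input space is a small finite set instead of $\mathbb{R}^n$, the "atomic formulas" are a fixed list of Python predicates, and the names `root_label`, `decide`, `DOMAIN`, `X` and `ATOMS` are hypothetical. The current conjunction F is represented by the set of points (`region`) that satisfy it.

```python
from typing import Callable, Hashable, Iterable, List, Optional, Set, Tuple

Point = Hashable
Atom = Callable[[Point], bool]
Label = Tuple[str, object]  # ("leaf", bool) or ("atom", index into the atom list)


def root_label(region: List[Point], X: Set[Point],
               atoms: List[Atom], d: int) -> Optional[Label]:
    """Return the root label of some branching tree of depth <= d that
    decides membership in X for all points of `region`, or None if none exists."""
    # Depth 0: label 0 works if no point of the region lies in X ...
    if not any(p in X for p in region):
        return ("leaf", False)
    # ... and label 1 works if no point of the region lies outside X.
    if all(p in X for p in region):
        return ("leaf", True)
    if d == 0:
        return None
    # Enumerate candidate atoms; both subproblems must be positive instances.
    for index, atom in enumerate(atoms):
        yes = [p for p in region if atom(p)]
        no = [p for p in region if not atom(p)]
        if (root_label(yes, X, atoms, d - 1) is not None
                and root_label(no, X, atoms, d - 1) is not None):
            return ("atom", index)
    return None


def decide(x: Point, domain: Iterable[Point], X: Set[Point],
           atoms: List[Atom], d: int) -> bool:
    """Move down an (implicit) tree: recompute the root label, test x, recurse."""
    region = list(domain)
    while True:
        label = root_label(region, X, atoms, d)
        if label is None:
            raise ValueError("no branching tree of the requested depth exists")
        kind, value = label
        if kind == "leaf":
            return bool(value)
        holds = atoms[value](x)
        # Refine F to F and A(x), or F and not A(x), and decrease the depth.
        region = [p for p in region if atoms[value](p) == holds]
        d -= 1


# Toy instance: X holds the integers below 8 whose two lowest bits differ.
DOMAIN = range(8)
X = {1, 2, 5, 6}
ATOMS = [lambda p: bool(p & 1), lambda p: bool(p & 2), lambda p: bool(p & 4)]
```

In this toy instance no tree of depth 1 exists, while depth 2 suffices, so `decide(x, DOMAIN, X, ATOMS, 2)` agrees with `x in X` for every point of the domain. The exhaustive search here takes exponential time; the point of the proof is that it runs in polynomial space.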

Bruno Poizat pointed out that it would be natural to work with families of 
computation trees using the same parameters for all input sizes, since we use this 
convention for our polynomial-time algorithms. The reader may try to formulate 
and prove the counterparts of Theorems E3 and mi in that setting. 

Abstract. Block codes are first viewed as finite state automata represented 
as trellises. A technique termed subtrellis overlaying is introduced 
with the object of reducing decoder complexity. Necessary and sufficient 
conditions for subtrellis overlaying are next derived from the represen- 
tation of the block code as a group, partitioned into a subgroup and its 
cosets. Finally a view of the code as a graph permits a combination of two 
shortest path algorithms to facilitate efficient decoding on an overlayed 
trellis. 



1 Introduction 

The areas of system theory, coding theory and automata theory have much in 
common, but historically have developed largely independently. A recent book^j 
elaborates some of the connections. In block coding, an information sequence of 
symbols over a finite alphabet is divided into message blocks of fixed length; 
each message block consists of k information symbols. If q is the size of the finite 
alphabet, there are a total of $q^k$ distinct messages. Each message is encoded into 
a distinct codeword of n (n > k) symbols. There are thus $q^k$ codewords each of 
length n and this set forms a block code of length n. A block code is typically used 
to correct errors that occur in transmission over a communication channel. A 
subclass of block codes, the linear block codes, has been used extensively for error 
correction. Traditionally such codes have been described algebraically, their 
algebraic properties playing a key role in hard decision decoding algorithms. In hard 
decision algorithms, the signals received at the output of the channel are quan- 
tized into one of the q possible transmitted values, and decoding is performed 
on a block of symbols of length n representing the received codeword, possibly 
corrupted by some errors. By contrast, soft decision decoding algorithms do not 
require quantization before decoding and are known to provide significant coding 
gains when compared with hard decision decoding algorithms. That block codes 
have efficient combinatorial descriptions in the form of trellises was discovered in 
1974 Two other early papers in this subject were m and El- A landmark 
paper by Forney Q in 1988 began an active period of research on the trellis 
structure of block codes. It was realized that the well known Viterbi Algorithm 
m (which is actually a dynamic programming shortest path algorithm) could 
be applied to soft decision decoding of block codes. Most studies on the trellis 
structure of block codes confined their attention to linear codes for which it was 
shown that unique minimal trellises exist m Trellises have been studied from 
the viewpoint of linear dynamical systems and also within an algebraic frame- 
work HH] Q m Q. An excellent treatment of the trellis structure of codes is 
available in uni 

This paper introduces a technique called subtrellis overlaying. This essen- 
tially splits a single well structured finite automaton representing the code into 
several smaller automata, which are then overlayed, so that they share states. 
The motivation for this is a reduction in the size of the trellis, in order to im- 
prove the efficiency of decoding. We view the block code as a group partitioned 
into a subgroup and its cosets, and derive necessary and sufficient conditions 
for overlaying. The conditions turn out to be simple constraints on the coset 
leaders. We finally present a two-stage decoding algorithm where the first stage 
is a Viterbi algorithm performed on the overlayed trellis. The second stage is an 
adaption of the A* algorithm well known in the area of artificial intelligence. It is 
shown that sometimes decoding can be accomplished by executing only the first 
phase on the overlayed trellis (which is much smaller than the conventional 
trellis). Thus overlaying may offer significant practical benefits. Section 2 presents 
some background on block codes and trellises; Section 3 derives the conditions 
for overlaying. Section 4 describes the new decoding algorithm; finally, Section 5 
concludes the paper. 

2 Background 

We give a very brief background on subclasses of block codes called linear codes. 
Readers are referred to the classic text m- 

Let Fq be the field with q elements. It is customary to define linear codes alge- 
braically as follows: 

Definition 1. A linear block code C of length n over a field Fq is a k-dimensional 
subspace of an n-dimensional vector space over the field Fq (such a code 
is called an (n,k) code). 

The most common algebraic representation of a linear block code is the 
generator matrix G. A k x n matrix G whose rows are linearly independent 
and generate the subspace corresponding to C is called a generator matrix 
for C. Figure 1 shows a generator matrix for a (4,2) linear code over $F_2$. 

A general block code also has a combinatorial description in the form of a 
trellis. We borrow from Kschischang et al [Zj the definition of a trellis for a block 
code. 

$$G = \begin{bmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}$$

Fig. 1. Generator matrix for a (4, 2) linear binary code 
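
As a small illustration of how a generator matrix defines a code, the following sketch enumerates all $q^k$ messages and multiplies each by G. It assumes arithmetic modulo a prime q (so it covers $F_2$ but not general finite fields), and the function name `codewords` is hypothetical.

```python
from itertools import product


def codewords(G, q=2):
    """Return the set of codewords generated by the rows of G over Z_q (q prime)."""
    k = len(G)
    words = set()
    for message in product(range(q), repeat=k):
        # Codeword = message * G, computed column by column modulo q.
        word = tuple(sum(m * g for m, g in zip(message, column)) % q
                     for column in zip(*G))
        words.add(word)
    return words


# Generator matrix of Fig. 1.
G = [[0, 1, 1, 0],
     [1, 0, 0, 1]]
print(sorted("".join(map(str, w)) for w in codewords(G)))
# prints ['0000', '0110', '1001', '1111']
```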



Definition 2. A trellis for a block code C of length n, is an edge labeled directed 
graph with a distinguished root vertex s, having in-degree 0 and a distinguished 
goal vertex f having out-degree 0, with the following properties: 

1. All vertices can be reached from the root. 

2. The goal can be reached from all vertices. 

3. The number of edges traversed in passing from the root to the goal along any 
path is n. 

4. The set of n-tuples obtained by “reading off” the edge labels encountered in 
traversing all paths from the root to the goal is C. 

The length of a path (in edges) from the root to any vertex is unique and is 
sometimes called the time index of the vertex. One measure of the size of a trellis 
is the total number of vertices in the trellis. It is well known that minimal trellises 
for linear block codes are unique in] and constructable from a generator matrix 
for the code Q. Such trellises are known to be biproper. Biproperness is the 
terminology used by coding theorists to specify that the finite state automaton 
whose transition graph is the trellis, is deterministic, and so is the automaton 
obtained by reversing all the edges in the trellis. In contrast, minimal trellises for 
non-linear codes are, in general, neither unique, nor deterministic 0. Figure 2 
shows a trellis for the linear code in Figure 1. 







Fig. 2. A trellis for the linear block code of Figure 1 with $S_0 = s$ and $S_9 = f$ 

Willems m has given conditions under which an arbitrary block code (which 
he refers to as a dynamical system) has a unique minimal realization. 

Biproper trellises minimize a wide variety of structural complexity measures. 
McEliece ini has defined a measure of Viterbi decoding complexity in terms of 
the number of edges and vertices of a trellis, and has shown that the biproper 
trellis is the “best” trellis using this measure, as well as other measures based 
on the maximum number of states at any time index, and the total number of 
states. 

3 Overlaying of Subtrellises 

We now restrict our attention to linear block codes. As we have mentioned ear- 
lier, every linear code has a unique minimal biproper trellis, so this is our starting 
point. Our object is to describe an operation which we term subtrellis overlaying, 
which yields a smaller trellis. Reduction in the size of a trellis is a step in the 
direction of reducing decoder complexity. 

Let C be a linear (n,k) code with minimal trellis $T_C$. A subtrellis of $T_C$ is a 
connected subgraph of $T_C$ containing nodes at every time index i, $0 \le i \le n$, 
and all edges between them. Partition the states of $T_C$ into n + 1 groups, one 
for each time index. Let $S_i$ be the set of states corresponding to time index 
i, and $|S_i|$ denote the cardinality of the set $S_i$. Define $S_{max} = \max_i(|S_i|)$. The 
state-complexity profile of the code is defined as the sequence $(|S_0|, |S_1|, \ldots, |S_n|)$. 
Minimization of $S_{max}$ is often desirable and $S_{max}$ is referred to as the maximum 
state-complexity. Our object here is to partition the code C into disjoint 
subcodes, and “overlay” the subtrellises corresponding to these subcodes to get a 
reduced “shared” trellis. An example will illustrate the procedure. 

Example 1. Let C be the linear (4,2) code defined by the generator matrix 
in Figure 1. C consists of the set of codewords {0000, 0110, 1001, 1111} and is 
described by the minimal trellis in Figure 2. The state-complexity profile of the 
code is (1, 2, 4, 2, 1). Now partition C into subcodes $C_1$ and $C_2$ as follows: 

$$C = C_1 \cup C_2; \quad C_1 = \{0000, 0110\}; \quad C_2 = \{1001, 1111\};$$

with minimal trellises shown in Figures 3(a) and 3(b) respectively. 
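
The profiles in this example can be checked mechanically. The sketch below rests on an assumption not spelled out in the text: that the states of the minimal trellis at time index i correspond to the distinct sets of suffixes ("futures") that can follow a prefix of length i. This is the usual characterization for linear codes and their cosets; the function name is hypothetical.

```python
def state_complexity_profile(code):
    """Count, for each time index, the distinct future sets of the code's prefixes."""
    n = len(next(iter(code)))
    profile = []
    for i in range(n + 1):
        futures = {}
        for word in code:
            # Group the suffixes that can follow each prefix of length i.
            futures.setdefault(word[:i], set()).add(word[i:])
        # Prefixes with identical futures are merged into a single state.
        profile.append(len({frozenset(f) for f in futures.values()}))
    return profile


C = {"0000", "0110", "1001", "1111"}
C1 = {"0000", "0110"}
C2 = {"1001", "1111"}
print(state_complexity_profile(C))   # prints [1, 2, 4, 2, 1]
print(state_complexity_profile(C1))  # prints [1, 1, 2, 1, 1]
print(state_complexity_profile(C2))  # prints [1, 1, 2, 1, 1]
```

Each subcode alone has maximum state-complexity 2, which is what makes a shared trellis of width 2 conceivable.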

The next step is the “overlaying” of the subtrellises as follows. There are 
as many states at time index 0 and time index n as there are subcodes in the 
partition of C. States $(s_2, s'_2)$, $(s_3, s'_3)$, $(s_1, s'_1)$, $(s_4, s'_4)$ are superimposed 
to obtain the trellis in Figure 4. 

Note that overlaying may increase the state-complexity at some time indices 
(other than 0 and n), and decrease it at others. Codewords are represented by 
$(s_0^i, s_f^i)$ paths in the overlayed trellis, where $s_0^i$ and $s_f^i$ are the start and final 
states of subtrellis i. Thus paths from $s_0$ to $s_5$ and from $s'_0$ to $s'_5$ represent 
codewords in the overlayed trellis of Figure 4. Overlaying forces subtrellises for 
subcodes to “share” states. Note that the shared trellis is also two way proper, 
with $S_{max} = 2$ and state-complexity profile (2, 1, 2, 1, 2). 

Fig. 3. Minimal trellises for (a) $C_1 = \{0000, 0110\}$ and (b) $C_2 = \{1001, 1111\}$ 
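
The overlayed trellis of Example 1 is small enough to write down explicitly. The edge list below is a reconstruction from the text (the figure itself is not reproduced here), and the state names `v1` to `v4` for the shared states are hypothetical. The sketch checks the profile and the restricted acceptance condition: only paths from a start state to the final state of the same subtrellis are codewords.

```python
# Edges of the overlayed trellis as (source, symbol, target).
EDGES = [
    ("s0", "0", "v1"), ("s0'", "1", "v1"),
    ("v1", "0", "v2"), ("v1", "1", "v3"),
    ("v2", "0", "v4"), ("v3", "1", "v4"),
    ("v4", "0", "s5"), ("v4", "1", "s5'"),
]


def successors(state):
    return [(sym, dst) for src, sym, dst in EDGES if src == state]


def words(start, goal):
    """Collect the label sequences of all paths from start to goal."""
    found, stack = set(), [(start, "")]
    while stack:
        state, word = stack.pop()
        if state == goal:
            found.add(word)
        for sym, dst in successors(state):
            stack.append((dst, word + sym))
    return found


def profile(starts):
    """Number of states at each time index, following edges level by level."""
    level, sizes = set(starts), []
    while level:
        sizes.append(len(level))
        level = {dst for s in level for _, dst in successors(s)}
    return sizes


print(profile(["s0", "s0'"]))       # prints [2, 1, 2, 1, 2]
print(sorted(words("s0", "s5")))    # prints ['0000', '0110']
print(sorted(words("s0'", "s5'")))  # prints ['1001', '1111']
print(sorted(words("s0", "s5'")))   # prints ['0001', '0111'], not codewords
```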

Not all partitions of the code permit overlaying to obtain biproper trellises 
with a reduced value of $S_{max}$. For instance, consider the following partition of 
the code. 

$$C = C_1 \cup C_2; \quad C_1 = \{0000, 1001\}; \quad C_2 = \{0110, 1111\};$$

with minimal trellises $T_1$ and $T_2$ given in Figures 5(a) and 5(b) respectively. 
It turns out that there exists no overlaying of $T_1$ and $T_2$ with a smaller value 
of $S_{max}$ than that for the minimal trellis for C. 

The small example above illustrates several points. Firstly, it is possible to 
get a trellis with a smaller number of states to define essentially the same code as 
the original trellis, with the new trellis having several start and final states, and 
with a restricted definition of acceptance. Secondly, the new trellis is obtained 
by the superposition of smaller trellises so that some states are shared. Thirdly, 
not all decompositions of the original trellis allow for superposition to obtain 
a smaller trellis. The new trellises obtained by this procedure belong to a class 
termed tail-biting trellises described in a recent paper [2]. This class has assumed 
importance in view of the fact that trellises constructed in this manner can have 
low state complexity when compared with equivalent conventional trellises. It 
has been shown HD that the maximum of the number of states in a tail-biting 
trellis at any time index could be as low as the square root of the number of 
states in a conventional trellis at its midpoint. This lower bound however, is not 
tight, and there are several examples where it is not attained. 

Several questions arise in this context. We list two of these below. 

1. How does one decide for a given coordinate ordering, whether there exists 
an overlaying that achieves a given lower bound on the maximum state 
complexity at any time index, and in particular, the square root lower bound? 

2. Given that there exists an overlaying that achieves a given lower bound how 
does one find it? That is, how does one decide which states to overlay at 
each time index? 

While, to the best of our knowledge, there are no published algorithms to 
solve these problems efficiently, in the general case, there are several examples of 
constructions of minimal tailbiting trellises for specific examples from generator 
matrices in specific forms in 0. 

In the next few paragraphs, we define an object called an overlayed trellis 
and examine the conditions under which it can be constructed so that it achieves 
certain bounds. 

Let C be a linear code over a finite alphabet. (Actually a group code would 
suffice, but all our examples are drawn from the class of linear codes.) Let 
$C_0, C_1, \ldots, C_l$ be a partition of the code C, such that $C_0$ is a subgroup of C 
under the operation of componentwise addition over the structure that defines 
the alphabet set of the code (usually a field or a ring), and $C_1, \ldots, C_l$ are cosets 
of $C_0$ in C. Let $C_i = C_0 + h_i$ where $h_i$, $1 \le i \le l$, are coset leaders, with $C_i$ 
having minimal trellis $T_i$. The subcode $C_0$ is chosen so that the maximum state 
complexity is N (occurring at some time index, say, m), where N divides M, 
the maximum state complexity of the conventional trellis at that time index. 
The subcodes $C_0, C_1, \ldots, C_l$ are all disjoint subcodes whose union is C. Further, 
the minimal trellises for $C_0, C_1, \ldots, C_l$ are all structurally identical and two way 
proper. (That they are structurally identical can be verified by relabeling a path 
labeled $g_1 g_2 \ldots g_n$ in $C_0$ with $g_1 + h_{i1}, g_2 + h_{i2}, \ldots, g_n + h_{in}$ in the trellis 
corresponding to $C_0 + h_i$, where $h_i = h_{i1} h_{i2} \ldots h_{in}$.) We therefore refer to 
$T_1, T_2, \ldots, T_l$ as copies of $T_0$. 
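
The partition into a subgroup and its cosets can be computed directly for small codes. The following is a minimal sketch assuming codewords are digit strings over $Z_q$ with q at most 10; taking the lexicographically smallest remaining word as coset leader is an arbitrary choice made here for illustration, and the function name `cosets` is hypothetical.

```python
def cosets(code, subgroup, q=2):
    """Partition `code` into cosets of `subgroup`; return (leader, coset) pairs."""
    def add(a, b):
        # Componentwise addition modulo q.
        return "".join(str((int(x) + int(y)) % q) for x, y in zip(a, b))

    remaining = set(code)
    parts = []
    while remaining:
        leader = min(remaining)
        coset = {add(g, leader) for g in subgroup}
        if not coset <= remaining:
            raise ValueError("subgroup is not a subgroup of the code")
        parts.append((leader, coset))
        remaining -= coset
    return parts


C = {"0000", "0110", "1001", "1111"}
for leader, coset in cosets(C, {"0000", "0110"}):
    print(leader, sorted(coset))
# prints:
# 0000 ['0000', '0110']
# 1001 ['1001', '1111']
```

Adding the leader 1001 symbol by symbol to the words of the subgroup is exactly the relabeling that turns the trellis of the subgroup into the trellis of its coset.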



Definition 3. An overlayed proper trellis is said to exist for C with respect to 
the partition $C_0, C_1, \ldots, C_l$, where $C_i$, $0 \le i \le l$, are subcodes as defined above, 
corresponding to minimal trellises $T_0, T_1, \ldots, T_l$ respectively, with $S_{max}(T_0) = N$, 
iff it is possible to construct a proper trellis $T_v$ satisfying the following properties: 

1. The trellis $T_v$ has l + 1 start states labeled $[s_0, \emptyset, \emptyset, \ldots, \emptyset]$, $[\emptyset, s_1, \emptyset, \ldots, \emptyset]$, $\ldots$, 
$[\emptyset, \emptyset, \ldots, \emptyset, s_l]$, where $s_i$ is the start state for subtrellis $T_i$, $0 \le i \le l$. 

2. The trellis $T_v$ has l + 1 final states labeled $[f_0, \emptyset, \emptyset, \ldots, \emptyset]$, $[\emptyset, f_1, \emptyset, \ldots, \emptyset]$, $\ldots$, 
$[\emptyset, \emptyset, \ldots, \emptyset, f_l]$, where $f_i$ is the final state for subtrellis $T_i$, $0 \le i \le l$. 

3. Each state of $T_v$ has a label of the form $[p_0, p_1, \ldots, p_l]$ where $p_i$ is either $\emptyset$ or 
a state of $T_i$, $0 \le i \le l$. Each state of $T_i$ appears in exactly one state of $T_v$. 

4. There is a transition on symbol a from state labeled $[p_0, p_1, \ldots, p_l]$ to 
$[q_0, q_1, \ldots, q_l]$ in $T_v$ if and only if there is a transition from $p_i$ to $q_i$ in $T_i$, provided 
neither $p_i$ nor $q_i$ is $\emptyset$, for at least one value of i in the set $\{0, 1, 2, \ldots, l\}$. 

5. The maximum width of the trellis $T_v$ at an arbitrary time index i, 
$1 \le i \le n - 1$, is at most N. 

6. The set of paths from $[\emptyset, \emptyset, \ldots, s_j, \ldots, \emptyset]$ to $[\emptyset, \emptyset, \ldots, f_j, \ldots, \emptyset]$ is exactly $C_j$, 
$0 \le j \le l$. 

Let the state projection of state $[p_0, p_1, \ldots, p_i, \ldots, p_l]$ into subcode index i be 
$p_i$ if $p_i \neq \emptyset$ and empty if $p_i = \emptyset$. The subcode projection of $T_v$ into subcode index 
i is denoted by the symbol $|T_v|_i$ and consists of the subtrellis of $T_v$ obtained by 
retaining all the non-$\emptyset$ states in the state projection of the set of states into 
subcode index i and the edges between them. An overlayed trellis satisfies the 
property of projection consistency which stipulates that $|T_v|_i = T_i$. Thus every 
subtrellis $T_j$ is embedded in $T_v$ and can be obtained from it by a projection 
into the appropriate subcode index. We note here that the conventional trellis 
is equivalent to an overlayed trellis with M/N = 1. 

To obtain the necessary and sufficient conditions for an overlayed trellis to 
exist, critical use is made of the fact that $C_0$ is a group and $C_i$, $1 \le i \le l$, are 
its cosets. For simplicity of notation, we denote by G the subcode $C_0$ and by T 
the subtrellis $T_0$. Assume T has state complexity profile $(m_0, m_1, \ldots, m_n)$, where 
$m_r = m_t = N$, and $m_i < N$ for all $i < r$ and $i > t$. Thus r is the first time 
index at which the trellis attains maximum state complexity and t is the last. 
Note that it is not necessary that this complexity be retained between r and t, 
i.e., the state complexity may drop between r and t. Since each state of $T_v$ is 
an M/N-tuple, whose state projections are states in individual subtrellises, it 
makes sense to talk about a state in $T_i$ corresponding to a state in $T_v$. 

We now give a series of lemmas leading up to the main theorem which gives the 
necessary and sufficient conditions for an overlayed trellis to exist for a given 
decomposition of C into a subgroup and its cosets. The proofs of these are 
available in m 

Lemma 1. Any state v of $T_v$ at time index in the range 0 to t - 1 cannot have 
more outgoing edges than the corresponding state in T. Similarly, any state v at 
time index in the range r + 1 to n in $T_v$ cannot have more incoming edges than 
the corresponding state in T. 

We say that subtrellises $T_a$ and $T_b$ share a state v of $T_v$ at level i if v has 
non-$\emptyset$ state projections in both $T_a$ and $T_b$ at time index i. 

Lemma 2. If the trellises $T_a$ and $T_b$ share a state, say v, at level $i < t$, then they 
share states at all levels j such that $i \le j \le t$. Similarly, if they share a state v 
at level $i > r$, then they share states at all levels j such that $r \le j \le i$. 

Lemma 3. If trellises $T_a$ and $T_b$ share a state at time index i, then they share 
all states at time index i. 

Lemma 4. If $T_a$ and $T_b$ share states at levels i - 1 and i, then their coset leaders 
have the same symbol at level i. 

We use the following terminology. If h is a codeword, say $h_1 h_2 \ldots h_n$, then for 
$i < t$, $h_{i+1} \ldots h_t$ is called the tail of h at i; for $i > r$, $h_r \ldots h_i$ is called the head 
of h at level i. 

Lemma 5. If $T_a$ and $T_b$ have common states at level $i < t$, then there exist 
coset leaders $h_a$ and $h_b$ of the cosets corresponding to $T_a$ and $T_b$ such that $h_a$ 
and $h_b$ have the same tails at level i. Similarly, if $i > r$ there exist $h_a$ and $h_b$ 
such that they have the same heads at level i. 

Now each of the M/N copies of T has $m_i$ states at level i. Since the width of the 
overlayed trellis cannot exceed N at any time index j, $1 \le j \le n - 1$, at least 
$(M/N^2) \times m_i$ copies of trellis T must be overlayed at time index i. Thus there are 
at most $N/m_i$ (i.e. $(M/N) / ((M/N^2) \times m_i)$) groups of trellises that are overlayed 
on one another at time index i. From Lemma 5 we know that if S is a set of 
trellises that are overlayed on one another at level i, $i < t$, then the coset leaders 
corresponding to these trellises have the same tails at level i. Similarly, if $i > r$ 
the coset leaders have the same heads at level i. This leads us to the main theorem. 

Theorem 1. Let G be a subgroup of the group code C under componentwise 
addition over the appropriate structure, with $S_{max}(T_C) = M$, $S_{max}(T) = N$, and 
let G have M/N cosets with coset leaders $h_0, h_1, \ldots, h_{M/N-1}$. Let t, r be the time 
indices defined earlier. Then C has an overlayed proper trellis $T_v$ with respect to 
the cosets of G if and only if: 

For all i in the range $1 \le i \le n - 1$ there exist at most $N/m_i$ collections of coset 
leaders such that 

(i) If $1 \le i < t$, then the coset leaders within a collection have the same tails at 
level i. 

(ii) If $r < i < n$, the coset leaders within a collection have the same heads at level 
i. 

Corollary 1. If $M = N^2$ and the conditions of the theorem are satisfied, we 
obtain a trellis which satisfies the square root lower bound. 

Theorem 1 and corollary 1 answer both the questions about overlayed trellises 
posed earlier. However, the problem of the existence of an efficient algorithm for 
the decomposition of the code into a subgroup and its cosets remains open. In 
the next section we describe the decoding algorithm on an overlayed trellis. 

4 Decoding 

Decoding refers to the process of forming an estimate of the transmitted code- 
word X from a possibly garbled received version y. The received vector y consists 
of a sequence of n real numbers, where n is the length of the code. The soft de- 
cision decoding algorithm can be viewed as a shortest path algorithm on the 
trellis for the code. Based on the received vector, a cost $l(u, v)$ can be associated 
with an edge from node u to node v. The well known Viterbi decoding algorithm 
m is essentially a dynamic programming algorithm, used to compute a shortest 
path from the source to the goal node. 

4.1 The Viterbi Decoding Algorithm 

For purposes of this discussion, we assume that the cost is a non negative number. 
Since the trellis is a regular layered graph, the algorithm proceeds level by level, 
computing a survivor at each node; this is a shortest path to the node from 
the source. For each branch b, leaving a node at level i, the algorithm updates 
the survivor at that node by adding the cost of the branch to the value of the 
survivor. For each node at level i + 1, it compares the values of the path cost for 
each branch entering the node and chooses the one with minimum value. There 
will thus be only one survivor at the goal vertex, and this corresponds to the 
decoded codeword. For an overlayed trellis we are interested only in paths that 
go from $s_i$ to $f_i$, $0 \le i \le l$. 
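
The level-by-level survivor computation can be sketched as follows on the conventional trellis of the (4,2) example code. Several things here are assumptions made for illustration only: the branch cost is the squared distance to an antipodal signal (symbol 0 sent as +1, symbol 1 as -1), the state names are hypothetical, and the trellis is stored as one edge list per time index.

```python
# Minimal trellis of the code {0000, 0110, 1001, 1111}:
# LEVELS[i] holds the edges (source, symbol, target) from time i to time i + 1.
LEVELS = [
    [("s", "0", "a"), ("s", "1", "b")],
    [("a", "0", "a0"), ("a", "1", "a1"), ("b", "0", "b0"), ("b", "1", "b1")],
    [("a0", "0", "p"), ("a1", "1", "p"), ("b0", "0", "q"), ("b1", "1", "q")],
    [("p", "0", "f"), ("q", "1", "f")],
]


def branch_cost(symbol, y):
    """Assumed soft-decision metric: squared distance to the antipodal signal."""
    return (y - (1.0 if symbol == "0" else -1.0)) ** 2


def viterbi(levels, start, goal, received):
    """Return (cost, word) of a minimum-cost path from start to goal."""
    survivor = {start: (0.0, "")}
    for edges, y in zip(levels, received):
        nxt = {}
        for u, sym, v in edges:
            if u not in survivor:
                continue
            cost, word = survivor[u]
            cand = cost + branch_cost(sym, y)
            # Keep only the cheapest path entering each node.
            if v not in nxt or cand < nxt[v][0]:
                nxt[v] = (cand, word + sym)
        survivor = nxt
    return survivor[goal]


cost, word = viterbi(LEVELS, "s", "f", [0.9, -0.2, -0.8, 1.1])
print(word)  # prints 0110
```

Hard-decision quantization of the same received vector would give 0110 as well; the two approaches differ when the received values are close to the decision threshold.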



4.2 The A* Algorithm 

The A* algorithm is well known in the literature on artificial intelligence jO] and 
is a modification of the Dijkstra shortest path algorithm . That the A* algorithm 
can be used for decoding was demonstrated in |S|. The A* algorithm uses, in 
addition to the path length from the source to the node u, an estimate h(u, f) 
of the shortest path length from the node to the goal node in guiding the search. 
Let $L_T(u, f)$ be the shortest path length from u to f in T. Let h(u, f) be any 
lower bound such that $h(u, f) \le L_T(u, f)$, and such that h(u, f) satisfies the 
following inequality, i.e., for u a predecessor of v, $l(u, v) + h(v, f) \ge h(u, f)$. If 
both the above conditions are satisfied, then the algorithm A*, on termination, 
is guaranteed to output a shortest path from s to f. The algorithm is given below. 

Algorithm A* 

Input : A trellis T = (V, E, l) where V is the set of vertices, E is the set of 
edges and $l(u, v) \ge 0$ for edge (u, v) in E, a source vertex s and a destination 
vertex f. 

Output : The shortest path from s to f. 

/* p(u) is the cost of the current shortest path from s to u and P(u) is a current 
shortest path from s to u; S holds the vertices awaiting expansion and $\bar{S}$ the 
vertices already expanded */ 

begin 
    $\bar{S} \leftarrow \emptyset$, $S \leftarrow \{s\}$, $p(s) \leftarrow 0$, $P(s) \leftarrow ()$ 
    repeat 
        Let u be the vertex in S with minimum value of p(u) + h(u, f). 
        $S \leftarrow S \setminus \{u\}$; $\bar{S} \leftarrow \bar{S} \cup \{u\}$; 
        if u = f then return P(f); 
        for each $(u, v) \in E$ do 
            if $v \notin \bar{S}$ then 
            begin 
                $p(v) \leftarrow \min(p(u) + l(u, v), previous(p(v)))$; 
                if $p(v) \neq previous(p(v))$ then append (u, v) to P(u) 
                    to give P(v); 
                $S \leftarrow S \cup \{v\}$; 
            end 
    forever 
end 
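
For readers who prefer running code, here is a compact sketch of A* using a binary heap for the set of vertices awaiting expansion. It is not the authors' implementation: the graph representation (a dictionary of adjacency lists), the function name `a_star` and the example graph are assumptions, and the estimate h is required to satisfy the two conditions stated above.

```python
import heapq


def a_star(edges, s, f, h):
    """edges: dict mapping u to a list of (v, cost); h: dict of lower bounds to f.
    Returns (cost, path) for a shortest path from s to f, or None if unreachable."""
    best = {s: 0.0}   # p(u): cost of the best path found so far
    parent = {}       # enough to rebuild P(u)
    closed = set()    # vertices already expanded
    heap = [(h[s], s)]
    while heap:
        _, u = heapq.heappop(heap)
        if u in closed:
            continue  # stale heap entry
        closed.add(u)
        if u == f:
            path = [f]
            while path[-1] != s:
                path.append(parent[path[-1]])
            return best[f], path[::-1]
        for v, cost in edges.get(u, []):
            if v in closed:
                continue
            cand = best[u] + cost
            if cand < best.get(v, float("inf")):
                best[v] = cand
                parent[v] = u
                heapq.heappush(heap, (cand + h[v], v))
    return None


# Hypothetical example graph with an admissible and consistent estimate.
EDGES = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("f", 6)], "b": [("f", 1)]}
H = {"s": 3, "a": 2, "b": 1, "f": 0}
print(a_star(EDGES, "s", "f", H))  # prints (4.0, ['s', 'a', 'b', 'f'])
```

Setting every estimate to 0 turns the search into Dijkstra's algorithm; a tighter estimate only reduces the number of vertices expanded.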



4.3 Decoding on an Overlayed Trellis 

Decoding on an overlayed trellis needs at most two phases. In the first phase, 
a conventional Viterbi algorithm is run on the overlayed trellis. The aim of 
this phase is to obtain estimates h() for each node, which will subsequently be 
used in the A* algorithm that is run on subtrellises in the second phase. The 
winner in the first phase is either an $s_j$–$f_j$ path, in which case the second 
phase is not required, or an $s_i$–$f_j$ path, $i \neq j$, in which case the second phase 
is necessary. During the second phase, decoding is performed on one subtrellis 
at a time, the current subtrellis, say $T_j$ (corresponding to subcode $C_j$), being 
presently the most promising one in its potential to deliver the shortest path. If 
at any point the computed estimate of the shortest path in the current subtrellis 
exceeds the minimum estimate among the rest of the subtrellises, currently held 
by, say, subtrellis $T_k$, then the decoder switches from $T_j$ to $T_k$, making $T_k$ the 
current subtrellis. Decoding is complete when a final node is reached in the 
current subtrellis. The two phases are described below. (All assertions in italics 
have simple proofs given in ini). 

Phase 1. Execute a Viterbi decoding algorithm on the shared trellis, and obtain 
survivors at each node. Each survivor at a node u has a cost which is a lower 
bound on the cost of the least cost path from $s_j$ to u in an $s_j$–$f_j$ path passing 
through u, $1 \le j \le N$. If there exists a value of k for which an $s_k$–$f_k$ path is 
an overall winner then this is the shortest path in the original trellis $T_C$. If this 
happens decoding is complete. If no such $s_k$–$f_k$ path exists go to Phase 2. 

Phase 2 

1. Consider only subtrellises $T_j$ such that the winning path at $T_j$ is an $s_i$–$f_j$ 
path with $i \neq j$ (i.e., at some intermediate node a prefix of the $s_j$–$f_j$ path 
was “knocked out” by a shorter path originating at $s_i$), and such that there 
is no $s_k$–$f_k$ path with smaller cost. Let us call such trellises residual trellises. 
Initialize a sequence $P_j$ for each residual trellis $T_j$ to the empty sequence. 
$P_j$ in fact stores the current candidate for the shortest path in trellis $T_j$. 
Let the estimate $h(s_j, f_j)$ associated with the empty path be the cost of the 
survivor at $f_j$ obtained in the first phase. 

2. Create a heap of r elements where r is the number of residual trellises, with 
current estimate h() with minimum value as the top element. Let j be the 
index of the subtrellis with the minimum value of the estimate. Remove 
the minimum element corresponding to $T_j$ from the heap and run the A* 
algorithm on trellis $T_j$ (called the current trellis). For a node u, take $h(u, f_j)$ 
to be $h(s_j, f_j) - cost(survivor(u))$ where cost(survivor(u)) is the cost of the 
survivor obtained in the first phase. h() satisfies the two properties required 
of the estimator in the A* algorithm. 

3. At each step, compare $p(u) + h(u, f_j)$ in the current subtrellis with the top 
value in the heap. If at any step the former exceeds the latter (associated with 
subtrellis, say, $T_k$), then make $T_k$ the current subtrellis. Insert the current 
value of $p(u) + h(u, f_j)$ in the heap (after deleting the minimum element) 
and run the A* algorithm on $T_k$ either from start node $s_k$ (if $T_k$ was not 
visited earlier) or from the node which it last expanded in $T_k$. Stop when 
the goal vertex is reached in the current subtrellis. 

In the best case (if the algorithm needs to execute Phase 2 at all) the search will 
be restricted to a single residual subtrellis; the worst case will involve searching 
through all residual subtrellises. 
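
The two phases can be sketched on the overlayed trellis of Example 1. This is a simplified illustration, not the decoder described above in full detail: instead of switching explicitly between subtrellises, a single priority queue ordered by p(u) + h(u, f_j) holds entries from all residual subtrellises, which visits nodes in the same best-first order. The trellis is a reconstruction from the text, the state names and the squared-distance branch cost (0 sent as +1, 1 as -1) are assumptions, and each edge carries the set of subtrellis indices it belongs to.

```python
import heapq
from itertools import count

# LEVELS[i]: edges (source, symbol, target, subtrellises containing the edge).
LEVELS = [
    [("s0", "0", "v1", {0}), ("s1", "1", "v1", {1})],
    [("v1", "0", "v2", {0, 1}), ("v1", "1", "v3", {0, 1})],
    [("v2", "0", "v4", {0, 1}), ("v3", "1", "v4", {0, 1})],
    [("v4", "0", "f0", {0}), ("v4", "1", "f1", {1})],
]
STARTS, FINALS = ["s0", "s1"], ["f0", "f1"]


def branch_cost(symbol, y):
    return (y - (1.0 if symbol == "0" else -1.0)) ** 2


def decode(received):
    """Return (codeword, number of phases used)."""
    # Phase 1: Viterbi from all start states; surv[node] = (cost, origin, word).
    surv = {s: (0.0, j, "") for j, s in enumerate(STARTS)}
    for edges, y in zip(LEVELS, received):
        for u, sym, v, _ in edges:
            cost, origin, word = surv[u]
            cand = cost + branch_cost(sym, y)
            if v not in surv or cand < surv[v][0]:
                surv[v] = (cand, origin, word + sym)
    winner = min(range(len(FINALS)), key=lambda j: surv[FINALS[j]][0])
    if surv[FINALS[winner]][1] == winner:
        return surv[FINALS[winner]][2], 1
    # Phase 2: best-first search over the residual subtrellises.
    n, tie, heap = len(LEVELS), count(), []
    exact = [j for j, f in enumerate(FINALS) if surv[f][1] == j]
    bound = min((surv[FINALS[j]][0] for j in exact), default=float("inf"))
    for j in exact:  # survivors that are genuine codewords are finished entries
        cost, _, word = surv[FINALS[j]]
        heapq.heappush(heap, (cost, next(tie), cost, j, n, FINALS[j], word))
    for j, f in enumerate(FINALS):
        if surv[f][1] != j and surv[f][0] < bound:  # residual subtrellis
            heapq.heappush(heap, (surv[f][0], next(tie), 0.0, j, 0, STARTS[j], ""))
    closed = set()
    while heap:
        _, _, p, j, level, u, word = heapq.heappop(heap)
        if level == n:
            return word, 2
        if (j, u) in closed:
            continue
        closed.add((j, u))
        for a, sym, b, subs in LEVELS[level]:
            if a == u and j in subs:
                q = p + branch_cost(sym, received[level])
                h = surv[FINALS[j]][0] - surv[b][0]  # estimate h(b, f_j)
                heapq.heappush(heap, (q + h, next(tie), q, j, level + 1, b, word + sym))
    raise ValueError("no codeword found")


print(decode([0.9, -0.2, -0.8, 1.1]))  # prints ('0110', 1)
print(decode([0.05, 0.8, 0.7, -0.9]))  # prints ('1001', 2)
```

In the second call the Phase 1 winner is the path 0001 from $s_0$ to $f_1$, which is not a codeword, so the search continues inside the subtrellis for {1001, 1111}.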

5 Conclusions 

This paper offers a new perspective from which block codes may be fruitfully 
viewed. A technique called subtrellis overlaying is proposed, which reduces the 
size of the trellis representing the block code. Necessary and sufficient conditions 
for overlaying are derived from the representation of the code as a group. 
Finally, a decoding algorithm is proposed which requires at most two passes on the 
overlayed trellis. For transmission channels with high signal to noise ratio, it is 
likely that decoding will be efficient. This is borne out by simulations on a code 
called the hexacode [2] on an additive white Gaussian noise (AWGN) channel, 
where it was seen that the decoding on the overlayed trellis was faster than that 
on the conventional trellis for signal to noise ratios of 2.5 dB or more [14]. Future 
work will concentrate on investigating the existence of an efficient algorithm for 
finding a good decomposition of a code into a subgroup and its cosets, and on 
obtaining overlayed trellises for long codes. 

Abstract. Recently there has been significant progress in proving 
(exponential-time) worst-case upper bounds for the propositional 
satisfiability problem (SAT) and related problems. In particular, for MAX-2-SAT 
Niedermeier and Rossmanith recently presented an algorithm with worst- 
case upper bound 0(A-2^/2.8S...^^ bound 0(A-2^/® '‘'‘ -) is im- 

plicit from the paper by Bansal and Raman {K is the number of clauses). 
In this paper we improve this bound to , where K 2 is the num- 

ber of 2-clauses, and p is a polynomial. In addition, our algorithm and 
the proof are much simpler than the previous ones. The key ideas are 
to use the symmetric flow algorithm of Yannakakis and to count only 
2-clauses (and not 1-clauses). 



1 Introduction 



SAT (the problem of satisfiability of a propositional formula in conjunctive
normal form (CNF)) can be easily solved in time of the order $2^N$, where $N$ is the
number of variables in the input formula. In the early 1980s this trivial bound was
improved for formulas in 3-CNF by Monien and Speckenmeyer and independently
by Dantsin. After that, many upper bounds for SAT and its NP-complete
subproblems were obtained. Most authors consider bounds w.r.t. three main parameters:
the length $L$ of the input formula (i.e. the number of literal occurrences), the
number $K$ of its clauses and the number $N$ of the variables occurring in it. In this
paper we consider bounds w.r.t. the parameters $K$ and $L$. The best such bounds
for SAT are - IT^ and - P] {p is a polynomial). 

The maximum satisfiability problem (MAX-SAT) is an important generalization
of SAT. In this problem we are given a formula in CNF, and the answer is
the maximal number of simultaneously satisfiable clauses. This problem is
NP-complete¹ even if each clause contains at most two literals (MAX-2-SAT).
This problem was widely studied in the context of approximation algorithms.
As to the worst-case time bounds for the exact solution of MAX-SAT,
Niedermeier and Rossmanith recently proved two



* Supported by INTAS (project No. 96-0760) and RFBR (project No. 99-01-00113). 

¹ A more precise NP-formulation is, of course, “given a formula in CNF and an integer
k, decide whether there is an assignment that satisfies at least k clauses”.



H. Reichel and S. Tison (Eds.): STACS 2000, LNCS 1770, pp. 65-73, 2000.
(c) Springer-Verlag Berlin Heidelberg 2000







Edward A. Hirsch 



worst-case upper bounds: one for MAX-SAT, and $O(K \cdot 2^{K/2.88\ldots})$ for
MAX-2-SAT. For the latter bound, they presented an algorithm for MAX-SAT
running in $O(L \cdot 2^{L/5.76\ldots})$ time; the desired bound follows since $L \le 2K$.
Bansal and Raman have recently improved the MAX-SAT bounds to
$O(L \cdot 2^{K/2.35\ldots})$ and $O(L \cdot 2^{L/6.89\ldots})$, which leads to the
$O(K \cdot 2^{K/3.44\ldots})$ bound for MAX-2-SAT.
Niedermeier and Rossmanith posed the question of whether the bound for
MAX-2-SAT can be improved by a direct algorithm (and not by an algorithm for
general MAX-SAT for a bound w.r.t. $L$). In this paper, we answer this question
by giving an algorithm which solves MAX-2-SAT in $p(K)2^{K_2/4}$ time ($p$ is a
polynomial). In addition, our algorithm and the proof are much simpler than
the previous ones.

Most of the algorithms/bounds mentioned above use the Davis-Putnam
procedure. In short, this procedure allows one to reduce the problem for a formula
$F$ to the problem for two formulas $F[v]$ and $F[\bar v]$ (where $v$ is a propositional
variable). This is called “splitting”. Before the algorithm splits each of the obtained
two formulas, it can transform them into simpler formulas $F_1$ and $F_2$ (using some
transformation rules). The algorithm does not split a formula if it is trivial to
solve the problem for it; these formulas are the leaves of the splitting tree which
corresponds to the execution of such an algorithm. For most known algorithms, the
leaves are trivial formulas (i.e. the formulas containing no non-trivial clauses).

In the algorithm presented in this paper, the leaves are satisfiable formulas
and formulas for which the (polynomial time) “symmetric flow” algorithm of
Yannakakis [26] finds an optimal solution (this algorithm either finds an optimal
solution or simplifies the input formula). Transformation rules include the pure
literal rule, a slightly generalized resolution rule (using these two rules one can
solve MAX-SAT in polynomial time in the case that each variable occurs at
most twice; this was already observed in, e.g., [23]), the frequent 1-clause rule
of Niedermeier and Rossmanith, and the elimination of 1-clauses. Although in
MAX-SAT 1-clauses cannot be eliminated by the usual unit propagation
technique, in the case of MAX-2-SAT they can be eliminated by the symmetric
flow algorithm of Yannakakis [26].
Thus, before each splitting we can transform a formula into one which consists of
2-clauses, and in which each variable occurs at least three times. Therefore, each
splitting eliminates at least three 2-clauses in each branch. This observation would
already improve the known bound to $p(K)2^{K_2/3}$, where $K_2$ is the number of
2-clauses in the input formula and $p$ is a polynomial. However, by careful choice
of a variable for splitting, we get a better bound $p(K)2^{K_2/4}$, which implies the
bound $p(L)2^{L/8}$ (since $L \ge 2K_2$).

In Sect. 2 we give basic definitions and formulate in our framework the known
results we use. In Sect. 3 we present the algorithm and the proof of its worst-case
upper bound.



2 Background 

Let $V$ be a set of Boolean variables. The negation of a variable $v$ is denoted
by $\bar v$. Given a set $U$, we denote $\overline{U} = \{\bar u \mid u \in U\}$. Literals (usually denoted
by $l, l', l_1, l_2, \ldots$) are the members of the set $W = V \cup \overline{V}$. Positive literals are
the members of the set $V$. Negative literals are their negations. If $w$ denotes a
negative literal $\bar v$, then $\bar w$ denotes the variable $v$.

Algorithms for finding the exact solution of MAX-SAT are usually designed 
for the unweighted MAX-SAT problem. However, the formulas are usually rep- 
resented by multisets (i.e., formulas in CNF with integer positive weights). In 
this paper we consider the weighted MAX-SAT problem with positive integer 
weights. A (weighted) clause is a pair $(\omega, S)$ where $\omega$ is a strictly positive
integer number, and $S$ is a nonempty finite set of literals which does not contain
simultaneously any variable together with its negation. We call $\omega$ the weight of
a clause $(\omega, S)$.

An assignment is a finite subset of $W$ which does not contain any variable
together with its negation. Informally speaking, if an assignment $A$ contains a
literal $l$, it means that $l$ has the value True in $A$. In addition to usual clauses,
we allow a special true clause $(\omega, T)$ which is satisfied by every assignment. (We
also call it a T-clause.)

The length of a clause $(\omega, S)$ is the cardinality of $S$. A k-clause is a clause
of length exactly k. In this paper a formula in (weighted) CNF (or simply
formula) is a finite set of (weighted) clauses $(\omega, S)$, at most one for each $S$. The
length of a formula is the sum of the lengths of all its clauses. The total weight
of all clauses of a formula $F$ is denoted by $\mathfrak{K}(F)$. The total weight of all 2-clauses
of a formula $F$ is denoted by $\mathfrak{K}_2(F)$.

The pairs $(0, S)$ are not clauses; however, for simplicity we write $(0, S) \in F$
for all $S$ and all $F$. Therefore, the operators $+$ and $-$ are defined:

\[ F + G = \{ (\omega_1 + \omega_2, S) \mid (\omega_1, S) \in F \text{ and } (\omega_2, S) \in G, \text{ and } \omega_1 + \omega_2 > 0 \}, \]

\[ F - G = \{ (\omega_1 - \omega_2, S) \mid (\omega_1, S) \in F \text{ and } (\omega_2, S) \in G, \text{ and } \omega_1 - \omega_2 > 0 \}. \]

Example 1. If

$F = \{ (2, T),\ (3, \{x, y\}),\ (4, \{\bar x, \bar y\}) \}$

and

$G = \{ (2, \{x, y\}),\ (4, \{\bar x, \bar y\}) \}$,

then

$F - G = \{ (2, T),\ (1, \{x, y\}) \}$.

□
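
The operators $+$ and $-$ are easy to mirror in code. The following is a minimal sketch and not part of the paper: it assumes a formula is stored as a Python dict from literal sets to positive weights, with integers standing for literals (a negative integer is a negated variable) and a hypothetical key `TRUE` for the T-clause. It replays Example 1 under the assumption that the second clause of F and of G is over the two negated variables; the negation bars are not legible in this copy, so that placement is a guess.

```python
TRUE = "T"  # hypothetical key standing for the special true clause (w, T)

def add(F, G):
    """F + G: weights of clauses over the same literal set are summed."""
    H = dict(F)
    for S, w in G.items():
        H[S] = H.get(S, 0) + w
    return {S: w for S, w in H.items() if w > 0}

def sub(F, G):
    """F - G: weights are subtracted; non-positive results are dropped."""
    H = dict(F)
    for S, w in G.items():
        H[S] = H.get(S, 0) - w
    return {S: w for S, w in H.items() if w > 0}

x, y = 1, 2  # variables; -x and -y denote their negations
F = {TRUE: 2, frozenset({x, y}): 3, frozenset({-x, -y}): 4}
G = {frozenset({x, y}): 2, frozenset({-x, -y}): 4}
difference = sub(F, G)
```

Treating a missing clause as having weight 0 is exactly the convention "$(0, S) \in F$ for all $S$" used in the definition.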

For a literal $l$ and a formula $F$, the formula $F[l]$ is obtained by setting the
value of $l$ to True. More precisely, we define

\[
\begin{aligned}
F[l] = {} & \{ (\omega, S) \mid (\omega, S) \in F \text{ and } l, \bar l \notin S \} \\
 & + \{ (\omega, S \setminus \{\bar l\}) \mid (\omega, S) \in F \text{ and } S \ne \{\bar l\}, \text{ and } \bar l \in S \} \\
 & + \{ (\omega, T) \mid \omega \text{ is the sum of the weights } \omega' \text{ of all clauses } (\omega', S) \text{ of } F \text{ such that } l \in S \}.
\end{aligned}
\]

(Note that no $(\omega, \emptyset)$ or $(0, S)$ is included in $F[l]$, $F + G$ or $F - G$.) For an
assignment $A = \{l_1, \ldots, l_s\}$ and a formula $F$, we define $F[A] = F[l_1][l_2] \cdots [l_s]$
(evidently, $F[l][l'] = F[l'][l]$ for all literals $l, l'$ such that $l \ne \bar{l'}$). For short, we
write $F[l_1, \ldots, l_s]$ instead of $F[\{l_1, \ldots, l_s\}]$.

Example 2. If

$F = \{ (1, T),\ (1, \{x, y\}),\ (5, \{y\}),\ (2, \{\bar x, y\}),\ (10, \{z\}),\ (2, \{\bar x, \bar z\}) \}$,

then

$F[x, z] = \{ (12, T),\ (7, \{y\}) \}$.

□
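
The substitution $F[l]$ can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's code: formulas are dicts from literal sets to weights, integers stand for literals (negative means negated), and `TRUE` is a hypothetical key for the T-clause. The instance below is one placement of the negation bars in Example 2 that reproduces the printed result $\{(12, T), (7, \{y\})\}$; the bars are not legible in this copy, so the placement is an assumption.

```python
TRUE = "T"  # hypothetical key standing for the special true clause (w, T)

def assign(F, lit):
    """Return F[lit]: set literal `lit` to True in the weighted formula F."""
    out = {}
    satisfied = 0
    for S, w in F.items():
        if S == TRUE:
            out[TRUE] = out.get(TRUE, 0) + w
        elif lit in S:
            satisfied += w                      # clause becomes true
        elif -lit in S:
            rest = S - {-lit}
            if rest:                            # S != {-lit}: keep the shortened clause
                out[rest] = out.get(rest, 0) + w
        else:
            out[S] = out.get(S, 0) + w          # clause does not mention the variable
    if satisfied:
        out[TRUE] = out.get(TRUE, 0) + satisfied
    return out

x, y, z = 1, 2, 3
F = {TRUE: 1, frozenset({x, y}): 1, frozenset({y}): 5,
     frozenset({-x, y}): 2, frozenset({z}): 10, frozenset({-x, -z}): 2}
result = assign(assign(F, x), z)   # F[x, z]
```

Shortened clauses that coincide with an existing clause are merged by adding weights, which is how $(5, \{y\})$ and the remainder of $(2, \{\bar x, y\})$ become $(7, \{y\})$.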

The optimal value is $\mathrm{OptVal}(F) = \max_A \{ \omega \mid (\omega, T) \in F[A] \}$. An assignment
$A$ is optimal if $F[A]$ contains only one clause $(\omega, T)$ (or does not contain any
clauses, in this case $\omega = 0$) and $\mathrm{OptVal}(F) = \omega$ ($= \mathrm{OptVal}(F[A])$).

A formula is in 2-CNF if it contains only 2-clauses, 1-clauses and a T-clause. 
A formula is in 2E-CNF if it contains only 2-clauses and a T-clause. 

If we say that a (positive or negative) literal $v$ occurs in a clause or in a
formula, we mean that this clause (more formally, its second component) or this
formula (more formally, one of its clauses) contains the literal $v$. However, if we
say that a variable $v$ occurs in a clause or in a formula, we mean that this clause
or this formula contains the literal $v$, or it contains the literal $\bar v$. A variable $v$
occurs positively if the literal $v$ occurs, and occurs negatively if the literal $\bar v$
occurs. A literal $l$ is an $(i,j)$-literal if $l$ occurs exactly $i$ times in the formula and
the literal $\bar l$ occurs exactly $j$ times in the formula. A literal is pure in a formula
$F$ if it occurs in $F$, and its negation does not occur in $F$. The following lemma
is well-known and straightforward.

Lemma 1. If $l$ is a pure literal in $F$, then $\mathrm{OptVal}(F) = \mathrm{OptVal}(F[l])$.

In this paper, the resolvent $\mathfrak{R}(C, D)$ of clauses $C = (\omega_1, \{l_1, l_2\})$ and
$D = (\omega_2, \{\bar l_1, l_3\})$ is the formula

\[ \{ (\max(\omega_1, \omega_2), T),\ (\min(\omega_1, \omega_2), \{l_2, l_3\}) \} \]

if $l_2 \ne \bar l_3$, and the formula $\{ (\omega_1 + \omega_2, T) \}$ otherwise. This definition is not
traditional, but it is very useful in the MAX-SAT context.

The following lemma is a straightforward generalization of the resolution
correctness (see, e.g., [21]) for the case when there are weights, but the literal
on which we are resolving does not occur in other clauses of the formula.

Lemma 2. If $F$ contains clauses $C = (\omega_1, \{v, l_1\})$ and $D = (\omega_2, \{\bar v, l_2\})$ such
that the variable $v$ does not occur in other clauses of $F$, then

\[ \mathrm{OptVal}(F) = \mathrm{OptVal}( (F - \{C, D\}) + \mathfrak{R}(C, D) ). \]
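
Lemma 2 can be checked on small instances by brute force. The sketch below is illustrative only and rests on assumptions: formulas are dicts from literal sets to weights, integers stand for literals, `TRUE` is a hypothetical key for the T-clause, and the weighted resolvent is read as "a T-clause of weight max plus the clause of the two remaining literals with weight min, or a single T-clause of weight $\omega_1 + \omega_2$ when the remaining literals are complementary". The helper names are invented for this example.

```python
from itertools import product

TRUE = "T"  # hypothetical key standing for the special true clause (w, T)

def opt_val(F):
    """Brute-force OptVal: maximum total weight of simultaneously satisfied clauses."""
    variables = sorted({abs(l) for S in F if S != TRUE for l in S})
    best = 0
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        total = sum(w for S, w in F.items()
                    if S == TRUE or any(value[abs(l)] == (l > 0) for l in S))
        best = max(best, total)
    return best

def resolvent(w1, l1, w2, l2):
    """Weighted resolvent of C = (w1, {v, l1}) and D = (w2, {-v, l2}) on v."""
    if l1 == -l2:
        return {TRUE: w1 + w2}
    return {TRUE: max(w1, w2), frozenset({l1, l2}): min(w1, w2)}

def resolve(F, v):
    """(F - {C, D}) + R(C, D), where C and D are the only clauses containing v."""
    (C,) = [S for S in F if S != TRUE and v in S]
    (D,) = [S for S in F if S != TRUE and -v in S]
    (l1,) = C - {v}
    (l2,) = D - {-v}
    G = {S: w for S, w in F.items() if S not in (C, D)}
    for S, w in resolvent(F[C], l1, F[D], l2).items():
        G[S] = G.get(S, 0) + w
    return G

v, a, b, c = 1, 2, 3, 4
F = {frozenset({v, a}): 3, frozenset({-v, b}): 2,
     frozenset({-a, -b}): 1, frozenset({-a}): 2, frozenset({-b, c}): 1}
G = resolve(F, v)
```

Here `G` no longer mentions the variable `v`, and `opt_val(F)` and `opt_val(G)` agree, as the lemma predicts for a variable that occurs exactly once positively and once negatively.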

The following simple observation is also well-known.

Lemma 3. Let $F$ be a formula in weighted CNF, and $v$ be a variable. Then

\[ \mathrm{OptVal}(F) = \max( \mathrm{OptVal}(F[v]), \mathrm{OptVal}(F[\bar v]) ). \]
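
Lemma 3 is the splitting step itself, and applied recursively it already gives an exact (exponential-time) procedure. The following minimal sketch compares such a recursion with exhaustive search on one small weighted formula. It assumes the same illustrative encoding as before (dicts from literal sets to weights, integers for literals, a hypothetical `TRUE` key for the T-clause); none of the reduction rules of the paper are used, so this is not Algorithm 1.

```python
from itertools import product

TRUE = "T"  # hypothetical key standing for the special true clause (w, T)

def assign(F, lit):
    """Return F[lit]: set literal `lit` to True in the weighted formula F."""
    out, satisfied = {}, 0
    for S, w in F.items():
        if S == TRUE:
            out[TRUE] = out.get(TRUE, 0) + w
        elif lit in S:
            satisfied += w
        elif -lit in S:
            rest = S - {-lit}
            if rest:
                out[rest] = out.get(rest, 0) + w
        else:
            out[S] = out.get(S, 0) + w
    if satisfied:
        out[TRUE] = out.get(TRUE, 0) + satisfied
    return out

def opt_val_split(F):
    """OptVal via Lemma 3: split on a variable until only the T-clause is left."""
    literals = [l for S in F if S != TRUE for l in S]
    if not literals:
        return F.get(TRUE, 0)
    v = abs(literals[0])
    return max(opt_val_split(assign(F, v)), opt_val_split(assign(F, -v)))

def opt_val_brute(F):
    """Reference value: try every assignment of the occurring variables."""
    variables = sorted({abs(l) for S in F if S != TRUE for l in S})
    best = 0
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        best = max(best, sum(w for S, w in F.items()
                             if S == TRUE or any(value[abs(l)] == (l > 0) for l in S)))
    return best

F = {frozenset({1, 2}): 2, frozenset({-1, 2}): 1, frozenset({-2}): 3,
     frozenset({1, -3}): 1, frozenset({3}): 1}
```

Each call to `assign` removes one variable completely, so the recursion depth is at most the number of variables.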

We also note that a polynomial time algorithm for 2-SAT is known. In our 
context, a formula F is satisfiable if OptVal(F) is equal to the sum of the weights 
of all clauses occurring in F. 

Lemma 4 (see, e.g., [5]). There is a polynomial time algorithm for 2-SAT.

Yannakakis presented in [26] an algorithm which transforms a formula in
2-CNF into a formula in 2E-CNF which has the same optimal value. This algorithm
consists of two stages. The first stage is the removal of a maximum symmetric flow
from a graph corresponding to the formula; this stage can be considered as a
combination of three transformation rules² (it is not important for us now which
combination):

1. Replacing a “cycle”

   $\{ (\omega, \{\bar l_1, l_2\}),\ (\omega, \{\bar l_2, l_3\}),\ \ldots,\ (\omega, \{\bar l_k, l_1\}) \}$

   by another cycle

   $\{ (\omega, \{l_1, \bar l_2\}),\ (\omega, \{l_2, \bar l_3\}),\ \ldots,\ (\omega, \{l_k, \bar l_1\}) \}$;

2. Replacing a set

   $\{ (\omega, \{l_1\}),\ (\omega, \{\bar l_1, l_2\}),\ (\omega, \{\bar l_2, l_3\}),\ \ldots,\ (\omega, \{\bar l_{k-1}, l_k\}) \}$

   by the set

   $\{ (\omega, \{l_1, \bar l_2\}),\ (\omega, \{l_2, \bar l_3\}),\ \ldots,\ (\omega, \{l_{k-1}, \bar l_k\}),\ (\omega, \{l_k\}) \}$;

3. Replacing two contradictory clauses $(\omega, \{l\})$ and $(\omega, \{\bar l\})$ by a true clause of
   the weight $\omega$.
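
The first two rules preserve the weight of satisfied clauses under every single assignment, not only the optimum. The short derivation below illustrates why. It assumes the reading of the rules in which every 2-clause of the original chain has the form $\{\bar l_i, l_{i+1}\}$ and is replaced by $\{l_i, \bar l_{i+1}\}$; the negation bars are not legible in this copy, so this reading is an assumption. Here $a_i \in \{0,1\}$ is the truth value of $l_i$ and $[\cdot]$ is 1 if the clause is satisfied and 0 otherwise.

```latex
\begin{align*}
[\bar l_i \vee l_{i+1}] &= 1 - a_i + a_i a_{i+1}, &
[l_i \vee \bar l_{i+1}] &= 1 - a_{i+1} + a_i a_{i+1}, \\
\sum_{i=1}^{k-1} \bigl( [\bar l_i \vee l_{i+1}] - [l_i \vee \bar l_{i+1}] \bigr)
  &= \sum_{i=1}^{k-1} (a_{i+1} - a_i) = a_k - a_1, \\
a_1 + \sum_{i=1}^{k-1} [\bar l_i \vee l_{i+1}]
  &= a_k + \sum_{i=1}^{k-1} [l_i \vee \bar l_{i+1}].
\end{align*}
```

The last line is rule 2: moving the 1-clause from $l_1$ to $l_k$ compensates the telescoping difference exactly. For rule 1 the indices run around a cycle, so the telescoping sum is zero and no 1-clause is needed.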

The second stage is replacing the obtained formula $F'$ by the formula $F'[A]$ for
some assignment $A$ (it is not important for us now which assignment). Evidently,
this algorithm does not increase the total weight of all 2-clauses. Therefore, we
can formulate the result of Yannakakis in the following form.

Lemma 5 ([26]). There is a polynomial time algorithm which, given an input
formula $F$ in weighted 2-CNF, outputs a formula $G$ in weighted 2E-CNF, such
that $\mathfrak{K}_2(G) \le \mathfrak{K}_2(F)$, and $\mathrm{OptVal}(F) = \mathrm{OptVal}(G)$.

The following fact was observed by Niedermeier and Rossmanith. 

Lemma 6 ([20]). If the weight of a 1-clause $(\omega, \{l\})$ of a formula $F$ is not less
than the total weight of all clauses of $F$ containing the literal $\bar l$, then
$\mathrm{OptVal}(F) = \mathrm{OptVal}(F[l])$.

² This replacing is made by subtracting the weights: e.g., if a formula contains a
clause $(\omega', \{\bar l_1, l_2\})$ with $\omega' > \omega$, then it is split into two clauses
$(\omega' - \omega, \{\bar l_1, l_2\})$ and $(\omega, \{\bar l_1, l_2\})$, and the latter clause is replaced as formulated.






3 Results 

In this section we present Algorithm 1 which solves MAX-2-SAT in time
$p(K)2^{K_2/4}$, where $p$ is a polynomial, $K$ is the total weight of all clauses in
the input formula, and $K_2$ is the total weight of 2-clauses in it (in the case of
unweighted MAX-2-SAT, $K_2$ is the number of 2-clauses).

Algorithm 1. 

Input: A formula F in weighted 2-CNF. 

Output: OptVal(F). 

Method. 

(1) Apply the symmetric flow algorithm from [26] (see Lemma 5) to $F$.

(2) If there is a pure literal $l$ in $F$, assume $F := F[l]$.

(3) If there is a variable that occurs in $F$ exactly once positively in a clause $C$
and exactly once negatively in a clause $D$, then $F := (F - \{C, D\}) + \mathfrak{R}(C, D)$.

(4) If $F$ has been changed at steps (2)-(3), then go to step (1).

(5) If $F$ is satisfiable³, return the sum of the weights of all its clauses.

(6) If there is a variable $v$ such that the total weight of the clauses of $F$ in which
this variable occurs is at least 4, then execute Algorithm 1 for the formulas
$F[v]$ and $F[\bar v]$ and return the maximum of the two answers.

(7) Find⁴ in $F$ a clause $(\omega, \{l_1, l_2\})$ such that $l_1$ and $l_2$ are (2,1)-literals, and
the two other clauses $C$ and $D$ containing the literals $l_2, \bar l_2$ do not contain
the literals $l_1, \bar l_1$. Execute Algorithm 1 for the formulas
$(F[l_1] - \{C, D\}) + \mathfrak{R}(C, D)$ and $F[\bar l_1, l_2]$, and return the maximum of the two answers.



□ 



Theorem 1. Given a formula $F$ in 2-CNF, Algorithm 1 always correctly finds
$\mathrm{OptVal}(F)$ in time $p(\mathfrak{K}(F)) \cdot 2^{\mathfrak{K}_2(F)/4}$, where $p$ is a polynomial.

Proof. Correctness. If Algorithm 1 outputs an answer, then its correctness
follows from the lemmata of Sect. 2 (step (1): Lemma 5; step (2): Lemma 1;
step (3): Lemma 2; step (6): Lemma 3; step (7): Lemmata 2, 3 and 6, where
Lemma 6 is applied to the clause $(1, \{l_2\})$ in the formula $F[\bar l_1]$; note that at step (7)
the formula $F$ consists of clauses of weight 1).

Since any change at steps (2) and (3) decreases the total weight of 2-clauses in
$F$, and step (1) does not increase it, the cycle (1)-(4) is repeated a polynomial
number of times. Now it remains to show that at step (7) Algorithm 1 can always
find a clause satisfying its conditions.

Note that at step (7) the formula $F$ is not satisfiable, consists only of 2-clauses
(and, maybe, a T-clause), does not contain pure literals, and each variable occurs
in it exactly three times, i.e. $F$ contains only (2,1)-literals and (1,2)-literals.
Since $F$ is not satisfiable, there exists at least one clause in it that contains two
(1,2)-literals (otherwise each clause contains a (2,1)-literal, i.e. the assignment
consisting of all (2,1)-literals is satisfying; cf. the “Extended Sign Principle”).
Thus, (1,2)-literals occur in at most $N - 1$ clauses of $F$, where $N$ is the number
of variables occurring in $F$. There are $3N/2$ 2-clauses in $F$. Hence, $F$ contains
more than $N/2$ 2-clauses consisting only of (2,1)-literals. There are at least
$N + 1$ literal occurrences in these clauses; thus, there is at least one (2,1)-literal
occurring in two such clauses. This literal, and (at least) one of the two literals
occurring with it in these clauses, satisfy the condition of step (7).

³ We can check it in polynomial time, see Lemma 4.
⁴ Theorem 1 proves that it is possible to find a clause satisfying the conditions of this
step.

Running time. Each of the steps of Algorithm 1 (not including recursive
calls) takes only polynomial time (Lemmata 4 and 5). The steps (1)-(5) do
not increase the total weight of 2-clauses in $F$. By the above argument, each
of these steps is executed a polynomial number of times during one execution
of Algorithm 1 (again not including recursive calls). It suffices to show that for
each formula $F'$ which is an argument of a recursive call, $\mathfrak{K}_2(F') \le \mathfrak{K}_2(F) - 4$.

Note that at the moment before a splitting (which precedes a recursive call),
the formula $F$ consists only of 2-clauses (and, maybe, a T-clause). Then the
statement follows from the conditions of the steps (6) and (7). □
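
The step from "each recursive call removes at least 4 units of 2-clause weight" to the exponent $1/4$ is a standard recurrence. The following worked derivation is a sketch added for illustration, with $T(k)$ a name introduced here for the maximal number of leaves of the splitting tree over all inputs whose total weight of 2-clauses is at most $k$.

```latex
\begin{align*}
T(k) &= 1 && \text{for } k < 4 \text{ (a split would leave a negative 2-clause weight)}, \\
T(k) &\le 2\,T(k-4) && \text{for } k \ge 4 \text{ (two branches, each at least 4 lighter)}, \\
T(k) &\le 2^{\lfloor k/4 \rfloor} \le 2^{k/4} && \text{by induction on } k.
\end{align*}
```

Multiplying the number of tree nodes by the polynomial work per node gives a bound of the form $p \cdot 2^{k/4}$. In an unweighted 2-CNF formula of length $L$ every 2-clause contributes two literal occurrences, so $k \le L/2$ and $2^{k/4} \le 2^{L/8}$.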

Corollary 1. Given a formula $F$ in unweighted⁵ 2-CNF of length $L$,
Algorithm 1 always correctly finds $\mathrm{OptVal}(F)$ in time $p(L)2^{L/8}$, where $p$ is a
polynomial.



Remark 1. Of course, in Corollary 1 only the number of literal occurrences in
2-clauses is essential in the exponent.

Remark 2. Algorithm 1 can be easily redesigned so that it finds the optimal
assignment (or one of them, if there are several assignments satisfying the same
number of clauses) instead of OptVal(F).

4 Conclusion 

In this paper we improved the existing upper bound for MAX-2-SAT with integer
weights to $p(K)2^{K_2/4}$, where $K_2$ is the total weight of 2-clauses of the input
formula (or the number of 2-clauses for unweighted MAX-2-SAT), $K$ is the total
weight of all clauses, and $p$ is a polynomial. This also implies the $p(L)2^{L/8}$ bound
for unweighted MAX-2-SAT, where $L$ is the number of literal occurrences.

One of the key ideas of our algorithm is to count only 2-clauses (since
MAX-1-SAT instances are trivial). It would be interesting to apply this idea to SAT, for
example, by counting only 3-clauses in 3-SAT (since 2-SAT instances are easy).
Also, it remains a challenge to find a “less-than-$2^N$” algorithm for MAX-SAT or
even MAX-2-SAT, where $N$ is the number of variables.

We did not investigate the practical performance of our algorithm and did not
even give a “practical” implementation. The “abstract” polynomial factor in
the bound could be replaced by a concrete (low degree) polynomial if such an
implementation is given.

⁵ I.e., all weights equal 1.

Abstract. The resource-bounded measures of certain classes of languages
are shown to be invariant under certain changes in the underlying
probability measure. Specifically, for any real number $\delta > 0$, any
polynomial-time computable sequence $\beta = (\beta_0, \beta_1, \ldots)$ of biases
$\beta_i \in [\delta, 1 - \delta]$, and any class $\mathcal{C}$ of languages that is closed upwards or
downwards under positive, polynomial-time truth-table reductions with linear
bounds on number and length of queries, it is shown that the following
two conditions are equivalent.

(1) $\mathcal{C}$ has p-measure 0 relative to the probability measure given by $\beta$.

(2) $\mathcal{C}$ has p-measure 0 relative to the uniform probability measure.

The analogous equivalences are established for measure in E and measure
in E$_2$. (Breutzmann and Lutz [5] established this invariance for classes $\mathcal{C}$
that are closed downwards under slightly more powerful reductions, but
nothing was known about invariance for classes that are closed upwards.)
The proof introduces two new techniques, namely, the contraction of a
martingale for one probability measure to a martingale for an induced
probability measure, and a new, improved positive bias reduction of one
bias sequence to another. Consequences for the BPP versus E problem
and small span theorems are derived.



1 Introduction 

Until recently, all research on the measure-theoretic structure of complexity
classes has been restricted to the uniform probability measure. This is the
probability measure $\mu$ that intuitively corresponds to a random experiment in which
a language $A \subseteq \{0,1\}^*$ is chosen probabilistically, using an independent toss of
a fair coin to decide whether each string is in $A$. When effectivized by the
methods of resource-bounded measure, $\mu$ induces measure-theoretic structure on
$\mathrm{E} = \mathrm{DTIME}(2^{\mathrm{linear}})$, $\mathrm{E}_2 = \mathrm{DTIME}(2^{\mathrm{polynomial}})$, and other complexity classes.
Investigations of this structure by a number of researchers have yielded many
new insights over the past seven years. Recent surveys describe
much of this work.

* This research was supported in part by National Science Foundation Grants
CCR-9157382 (with matching funds from Rockwell, Microware Systems Corporation, and
Amoco Foundation) and CCR-9610461. This work was done while the second author
was at Iowa State University.

There are several reasons for extending our investigation of resource-bounded 
measure to a wider variety of probability measures. First, such variety is essen- 
tial in cryptography, computational learning, algorithmic information theory, 
average-case complexity, and other potential application areas. Second,
applications of the probabilistic method [2] often require use of non-uniform
probability measures, and this is likely to hold for the resource-bounded probabilistic
method as well. Third, resource-bounded measure based on non-uniform
probability measures provides new methods for proving results about
resource-bounded measure based on the uniform probability measure.

Motivated by such considerations, Breutzmann and Lutz initiated the
study of resource-bounded measure based on an arbitrary (Borel) probability
measure $\nu$ on the Cantor space $\mathbf{C}$ (the set of all languages). (Precise definitions
of these and other terms appear in the expanded version of this paper.) Kautz
and Lutz have furthered this study in different directions, and the present
paper is another contribution.

The principal focus of the paper is the circumstances under which the
$\nu$-measure of a complexity class $\mathcal{C}$ is invariant when the probability measure $\nu$
is replaced by some other probability measure $\nu'$. For an arbitrary class $\mathcal{C}$ of
languages, such invariance can only occur if $\nu$ and $\nu'$ are fairly close to one
another: Extending results of Kakutani, Vovk, and Breutzmann and
Lutz, Kautz has shown that the “square-summable equivalence” of $\nu$
and $\nu'$ is sufficient to ensure $\nu_p(\mathcal{C}) = 0 \iff \nu'_p(\mathcal{C}) = 0$, but very little more can
be said when $\mathcal{C}$ is arbitrary.

Fortunately, complexity classes have more structure than arbitrary classes.
Most complexity classes of interest, including P, NP, coNP, R, BPP, AM, P/Poly,
PH, etc., are closed downwards under positive, polynomial-time truth-table
reductions ($\le^{\mathrm{P}}_{\mathrm{pos\text{-}tt}}$-reductions), and their intersections with E are closed
downwards under $\le^{\mathrm{P}}_{\mathrm{pos\text{-}tt}}$-reductions with linear bounds on the length of queries
($\le^{\mathrm{P,lin}}_{\mathrm{pos\text{-}tt}}$-reductions). Breutzmann and Lutz proved that every class $\mathcal{C}$ with
these closure properties enjoys a substantial amount of invariance in its measure.
Specifically, if $\mathcal{C}$ is any such class and $\beta$ and $\beta'$ are strongly positive P-sequences
of biases, then the equivalences

\[
\begin{aligned}
\mu^{\beta}_{p}(\mathcal{C}) = 0 &\iff \mu^{\beta'}_{p}(\mathcal{C}) = 0, \\
\mu^{\beta}(\mathcal{C} \mid \mathrm{E}) = 0 &\iff \mu^{\beta'}(\mathcal{C} \mid \mathrm{E}) = 0, \\
\mu^{\beta}(\mathcal{C} \mid \mathrm{E}_2) = 0 &\iff \mu^{\beta'}(\mathcal{C} \mid \mathrm{E}_2) = 0
\end{aligned}
\tag{1}
\]

hold, where $\mu^{\beta}$ and $\mu^{\beta'}$ are the probability measures corresponding to the bias
sequences $\beta$ and $\beta'$, respectively. (Intuitively, if $\beta = (\beta_0, \beta_1, \ldots)$ is a sequence
of biases $\beta_i \in [0, 1]$, then the measure $\mu^{\beta}$ corresponds to a random experiment
in which a language $A \subseteq \{0,1\}^*$ is chosen by tossing for each string $s_i$,
independently of all other strings, a special coin whose probability of heads is $\beta_i$. If the
toss comes up heads, then $s_i \in A$; otherwise $s_i \notin A$.)

Our primary concern in the present paper is to extend this bias invariance to
classes that are closed upwards under some type $\le^{\mathrm{P}}_{r}$ of polynomial reductions.
We have two reasons for interest in this question. First and foremost, many recent
investigations in complexity theory focus on the resource-bounded measure of
the upper $\le^{\mathrm{P}}_{r}$-span

\[ \mathrm{P}_r^{-1}(A) = \{ B \mid A \le^{\mathrm{P}}_{r} B \} \tag{2} \]

of a language $A$. Such investigations include work on small span theorems
and work on the BPP versus E question. In general, the
upper $\le^{\mathrm{P}}_{r}$-span of a language is closed upwards, but not downwards, under
$\le^{\mathrm{P}}_{r}$-reductions.

Our second reason for interest in upward closure conditions is that the
above-mentioned results of Breutzmann and Lutz [5] do not fully establish the
invariance of measures of complexity classes under the indicated changes of bias
sequences. For example, if $\beta$ is an arbitrary strongly positive P-sequence of biases,
the results of [5] show that

\[ \mu^{\beta}(\mathcal{C} \mid \mathrm{E}) = 0 \iff \mu(\mathcal{C} \mid \mathrm{E}) = 0, \tag{3} \]

but they do not show that

\[ \mu^{\beta}(\mathcal{C} \mid \mathrm{E}) = 1 \iff \mu(\mathcal{C} \mid \mathrm{E}) = 1. \tag{4} \]

In general, the condition $\nu(\mathcal{C} \mid \mathrm{E}) = 1$ is equivalent to $\nu(\mathcal{C}^c \mid \mathrm{E}) = 0$, where $\mathcal{C}^c$ is
the complement of $\mathcal{C}$. Since $\mathcal{C}$ is closed downwards under $\le^{\mathrm{P}}_{r}$-reductions if and
only if $\mathcal{C}^c$ is closed upwards under $\le^{\mathrm{P}}_{r}$-reductions, we are again led to consider
upward closure conditions.

Our main theorem, the Bias Invariance Theorem, states that, if $\mathcal{C}$ is any class
of languages that is closed upwards or downwards under positive,
polynomial-time, truth-table reductions with linear bounds on number and length of queries
($\le^{\mathrm{P,lin}}_{\mathrm{pos\text{-}lin\text{-}tt}}$-reductions), and if $\beta$ and $\beta'$ are strongly positive P-sequences of
biases, then the equivalences (1) above hold. The proof introduces two new
techniques, namely, the contraction of a martingale for one probability measure
to a martingale for an induced probability measure (dual to the martingale
dilation technique introduced in [5]) and a new, improved positive bias reduction
of one bias sequence to another.

We also note three easy consequences of our Bias Invariance Theorem. First,
in combination with work of Allender and Strauss and of Buhrman, van
Melkebeek, Regan, Sivakumar, and Strauss, it implies that, if there is any strongly
positive P-sequence of biases $\beta$ such that the complete $\le^{\mathrm{P}}_{\mathrm{T}}$-degree for E$_2$ does
not have $\mu^{\beta}$-measure 1 in E$_2$, then E $\not\subseteq$ BPP. Second, in combination with the
work of Regan, Sivakumar, and Cai, it implies that, for any reasonable
complexity class $\mathcal{C}$, if there exists a strongly positive P-sequence of biases $\beta$ such
that $\mathcal{C}$ has $\mu^{\beta}$-measure 1 in E, then E $\subseteq \mathcal{C}$ (and similarly for E$_2$). Third, if
$\le^{\mathrm{P}}_{r}$ is any polynomial reducibility such that $A \le^{\mathrm{P,lin}}_{\mathrm{pos\text{-}lin\text{-}tt}} B$ implies $A \le^{\mathrm{P}}_{r} B$, and
if $\beta$ is a strongly positive P-sequence of biases, then the small span theorem for
$\le^{\mathrm{P}}_{r}$-reductions holds with respect to $\mu^{\beta}$ if and only if it holds with respect to $\mu$.
Tantalizingly, this hypothesis places $\le^{\mathrm{P}}_{r}$ “just beyond” the small span theorem
of Buhrman and van Melkebeek, which is the strongest small span theorem
proven to date for exponential time.

Due to space limitations in these proceedings, all proofs are omitted from this
version of our paper. An expanded version, available at
http://www.cs.iastate.edu/~lutz/papers.html, includes proofs of our results.



2 Preliminaries 

We write $\{0,1\}^*$ for the set of all (finite, binary) strings, and we write $|x|$ for
the length of a string $x$. The empty string, $\lambda$, is the unique string of length 0.
The standard enumeration of $\{0,1\}^*$ is the sequence $s_0 = \lambda, s_1 = 0, s_2 = 1, s_3 =
00, \ldots$, ordered first by length and then lexicographically. For $x, y \in \{0,1\}^*$, we
write $x < y$ if $x$ precedes $y$ in this standard enumeration. For $n \in \mathbb{N}$, $\{0,1\}^n$
denotes the set of all strings of length $n$, and $\{0,1\}^{\le n}$ denotes the set of all
strings of length at most $n$.
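
The standard enumeration has a convenient closed form: $s_n$ is the binary representation of $n + 1$ with its leading 1 removed. The sketch below is an illustration added here, with hypothetical function names; it encodes $\lambda$ as the empty Python string.

```python
def s(n: int) -> str:
    """n-th string of the standard enumeration: binary of n + 1 without the leading 1."""
    return bin(n + 1)[3:]          # bin() yields '0b1...'; drop '0b' and the leading 1

def index(x: str) -> int:
    """Inverse of s: the position of the string x in the standard enumeration."""
    return int("1" + x, 2) - 1

def precedes(x: str, y: str) -> bool:
    """x < y in the standard order: first by length, then lexicographically."""
    return (len(x), x) < (len(y), y)

first_seven = [s(n) for n in range(7)]
```

In particular $|s_n| = \lfloor \log_2(n+1) \rfloor$, which is why running times measured in $|s_i|$ are times in $\log(i+1)$.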

If $x$ is a string or an (infinite, binary) sequence, and if $0 \le i \le j < |x|$,
then $x[i..j]$ is the string consisting of the $i$th through $j$th bits of $x$. In particular,
$x[0..i-1]$ is the $i$-bit prefix of $x$. We write $x[i]$ for $x[i..i]$, the $i$th bit of $x$. (Note
that the leftmost bit of $x$ is $x[0]$, the 0th bit of $x$.)

If $w$ is a string and $x$ is a string or sequence, then we write $w \sqsubseteq x$ if $w$ is a
prefix of $x$, i.e., if there is a string or sequence $y$ such that $x = wy$.

The Boolean value of a condition $\phi$ is $[\![\phi]\!] =$ if $\phi$ then 1 else 0.

We work in the Cantor space $\mathbf{C}$, consisting of all languages $A \subseteq \{0,1\}^*$. We
identify each language $A$ with its characteristic sequence, which is the infinite
binary sequence $\chi_A$ defined by

\[ \chi_A[n] = [\![ s_n \in A ]\!] \tag{5} \]

for each $n \in \mathbb{N}$. Relying on this identification, we also consider $\mathbf{C}$ to be the
set of all infinite binary sequences. The complement of a set $X$ of languages is
$X^c = \mathbf{C} - X$.

For each string $w \in \{0,1\}^*$, the cylinder generated by $w$ is the set

\[ \mathbf{C}_w = \{ A \in \mathbf{C} \mid w \sqsubseteq \chi_A \}. \tag{6} \]



2.1 Resource-Bounded $\nu$-Measure

Next, we briefly present the basic elements of resource-bounded measure based
on an arbitrary probability measure $\nu$ on $\mathbf{C}$. The remaining material in this
section is excerpted, with permission, from [5].

Definition 1. A probability measure on $\mathbf{C}$ is a function $\nu : \{0,1\}^* \to [0,1]$
such that $\nu(\lambda) = 1$, and for all $w \in \{0,1\}^*$,

\[ \nu(w) = \nu(w0) + \nu(w1). \tag{7} \]

Definition 2. A probability measure $\nu$ on $\mathbf{C}$ is positive if for all $w \in \{0,1\}^*$,
$\nu(w) > 0$.

Definition 3. If $\nu$ is a positive probability measure and $u, v \in \{0,1\}^*$, then the
conditional $\nu$-measure of $u$ given $v$ is

\[
\nu(u \mid v) =
\begin{cases}
1 & \text{if } u \sqsubseteq v \\
\dfrac{\nu(u)}{\nu(v)} & \text{if } v \sqsubseteq u \\
0 & \text{otherwise.}
\end{cases}
\tag{8}
\]

Note that $\nu(u \mid v)$ is the conditional probability that $A \in \mathbf{C}_u$, given that
$A \in \mathbf{C}_v$, when $A \in \mathbf{C}$ is chosen according to the probability measure $\nu$.

Definition 4. A probability measure $\nu$ on $\mathbf{C}$ is strongly positive if ($\nu$ is positive
and) there is a constant $\delta > 0$ such that, for all $w \in \{0,1\}^*$ and $b \in \{0,1\}$,
$\nu(wb \mid w) \ge \delta$.

Definition 5. A sequence of biases $\beta = (\beta_0, \beta_1, \beta_2, \ldots)$ is strongly positive if
there is a constant $\delta > 0$ such that, for all $i \in \mathbb{N}$, $\beta_i \in [\delta, 1 - \delta]$.

Definition 6. The $\beta$-coin-toss probability measure (also called the $\beta$-product
probability measure) is the probability measure $\mu^{\beta}$ defined by

\[ \mu^{\beta}(w) = \prod_{i=0}^{|w|-1} \bigl( (1 - \beta_i) \cdot (1 - w[i]) + \beta_i \cdot w[i] \bigr) \tag{9} \]

for all $w \in \{0,1\}^*$.
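
The product formula of Definition 6 is straightforward to evaluate exactly. The sketch below is an added illustration: `mu_beta` and the sample bias sequence `beta` (alternating 1/3 and 2/3, hence strongly positive with $\delta = 1/3$) are names and values chosen here, not taken from the paper. It also checks the additivity condition (7) on all strings of length at most 4.

```python
from fractions import Fraction
from itertools import product

def mu_beta(w, beta):
    """Coin-toss measure of the cylinder of w: product of per-bit probabilities."""
    p = Fraction(1)
    for i, bit in enumerate(w):
        p *= beta(i) if bit == "1" else 1 - beta(i)
    return p

def beta(i):
    """Sample bias sequence (an assumption for this example): 1/3, 2/3, 1/3, ..."""
    return Fraction(1, 3) if i % 2 == 0 else Fraction(2, 3)

# All binary strings of length 0..4.
words = ["".join(bits) for n in range(5) for bits in product("01", repeat=n)]

# Condition (7): the measure of w splits between its two one-bit extensions.
additive = all(
    mu_beta(w, beta) == mu_beta(w + "0", beta) + mu_beta(w + "1", beta)
    for w in words
)
```

With the constant sequence 1/2 the same function returns $2^{-|w|}$, the uniform measure.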

We next review the well-known notion of a martingale over a probability
measure $\nu$. Computable martingales were used by Schnorr in his
investigations of randomness, and have more recently been used by Lutz in
the development of resource-bounded measure.

Definition 7. Let $\nu$ be a probability measure on $\mathbf{C}$. Then a $\nu$-martingale is a
function $d : \{0,1\}^* \to [0, \infty)$ such that, for all $w \in \{0,1\}^*$,

\[ d(w)\nu(w) = d(w0)\nu(w0) + d(w1)\nu(w1). \]
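
A standard way to obtain a $\nu$-martingale is as a likelihood ratio: if $\rho$ is any probability measure and $\nu$ is positive, then $d(w) = \rho(w)/\nu(w)$ satisfies the averaging condition $d(w)\nu(w) = d(w0)\nu(w0) + d(w1)\nu(w1)$, because $\rho$ is additive. The sketch below checks this exactly on short strings. The choice of $\nu$ (uniform) and $\rho$ (a coin with bias 3/4) and all function names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

def coin(w, bias):
    """Coin-toss measure of the cylinder of w for the bias sequence `bias`."""
    p = Fraction(1)
    for i, bit in enumerate(w):
        p *= bias(i) if bit == "1" else 1 - bias(i)
    return p

def nu(w):
    """Reference measure (assumed here): the uniform measure."""
    return coin(w, lambda i: Fraction(1, 2))

def d(w):
    """Likelihood ratio of a 3/4-biased coin against nu; a nu-martingale."""
    return coin(w, lambda i: Fraction(3, 4)) / nu(w)

def is_martingale_on(words):
    """Check d(w) nu(w) = d(w0) nu(w0) + d(w1) nu(w1) for the given strings."""
    return all(
        d(w) * nu(w) == d(w + "0") * nu(w + "0") + d(w + "1") * nu(w + "1")
        for w in words
    )

words = ["".join(bits) for n in range(5) for bits in product("01", repeat=n)]
```

Intuitively, `d` is the capital of a gambler who starts with 1 and bets that 1s are more frequent than the uniform measure predicts; the capital grows along prefixes with many 1s.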

To satisfy space constraints, we omit discussion of the success and
computability of a $\nu$-martingale, which are similar to the corresponding notions for
$\mu$-martingales.

Definition 8. Let $\nu$ be a positive probability measure on $\mathbf{C}$, let $A \subseteq \{0,1\}^*$, and
let $i \in \mathbb{N}$. Then the conditional $\nu$-probability along $A$ is

\[ \nu_A(i+1 \mid i) = \nu(\chi_A[0..i] \mid \chi_A[0..i-1]). \tag{10} \]

Definition 9. Two positive probability measures $\nu$ and $\nu'$ on $\mathbf{C}$ are summably
equivalent, and we write $\nu \approx \nu'$, if for every $A \subseteq \{0,1\}^*$,

\[ \sum_{i=0}^{\infty} \bigl| \nu_A(i+1 \mid i) - \nu'_A(i+1 \mid i) \bigr| < \infty. \tag{11} \]

Definition 10. 1. A P-sequence of biases is a sequence $\beta = (\beta_0, \beta_1, \beta_2, \ldots)$ of
biases $\beta_i \in [0, 1]$ for which there is a function

\[ \hat\beta : \mathbb{N} \times \mathbb{N} \to \mathbb{Q} \cap [0, 1] \tag{12} \]

with the following two properties.

(i) For all $i, r \in \mathbb{N}$, $|\hat\beta(i, r) - \beta_i| \le 2^{-r}$.

(ii) There is an algorithm that, for all $i, r \in \mathbb{N}$, computes $\hat\beta(i, r)$ in time
polynomial in $|s_i| + r$ (i.e., in time polynomial in $\log(i + 1) + r$).

2. A P-exact sequence of biases is a sequence $\beta = (\beta_0, \beta_1, \beta_2, \ldots)$ of (rational)
biases $\beta_i \in \mathbb{Q} \cap [0, 1]$ such that the function $i \mapsto \beta_i$ is computable in time
polynomial in $|s_i|$.

Definition 11. If $\alpha$ and $\beta$ are sequences of biases, then $\alpha$ and $\beta$ are summably
equivalent, and we write $\alpha \approx \beta$, if $\sum_{i=0}^{\infty} |\alpha_i - \beta_i| < \infty$.

It is clear that $\alpha \approx \beta$ if and only if $\mu^{\alpha} \approx \mu^{\beta}$.

Lemma 1 (Breutzmann and Lutz [5]). For every P-sequence of biases $\beta$,
there is a P-exact sequence of biases $\beta'$ such that $\beta \approx \beta'$.

2.2 Truth-Table Reductions

A truth-table reduction (briefly, a $\le_{tt}$-reduction) is an ordered pair $(f, g)$ of total
recursive functions such that for each $x \in \{0,1\}^*$, there exists $n(x) \in \mathbb{N}$ such
that the following two conditions hold.

(i) $f(x)$ is (the standard encoding of) an $n(x)$-tuple $(f_1(x), \ldots, f_{n(x)}(x))$ of
strings $f_i(x) \in \{0,1\}^*$, which are called the queries of the reduction $(f, g)$ on
input $x$. We use the notation $Q_{(f,g)}(x) = \{ f_1(x), \ldots, f_{n(x)}(x) \}$ for the set of
such queries.

(ii) g{x) is (the standard encoding of) an n(a;)-input, 1-output Boolean circuit, 
called the truth table of the reduction (/, g) on input x. We identify g{x) 
with the Boolean function computed by this circuit, i.e.. 



g{x) ■- 



{ 0 , 1 } . 



(13) 

A truth-table reduction $(f, g)$ induces the function

$$F_{(f,g)} : \mathbf{C} \to \mathbf{C} \qquad (14)$$

$$F_{(f,g)}(A) = \{x \in \{0,1\}^* \mid g(x)([\![f_1(x) \in A]\!] \cdots [\![f_{n(x)}(x) \in A]\!]) = 1\}.$$

Similarly, the inverse image of the cylinder $\mathbf{C}_z$ under the
reduction $(f, g)$ is

$$F^{-1}_{(f,g)}(\mathbf{C}_z) = \{A \in \mathbf{C} \mid z \sqsubseteq F_{(f,g)}(A)\}. \qquad (15)$$

The following well-known fact is easily verified. 

Lemma 2. If $\nu$ is a probability measure on $\mathbf{C}$ and $(f,g)$ is a
$\le_{tt}$-reduction, then the function

$$\nu^{(f,g)} : \{0,1\}^* \to [0,1] \qquad (16)$$

$$\nu^{(f,g)}(z) = \nu(F^{-1}_{(f,g)}(\mathbf{C}_z))$$

is also a probability measure on $\mathbf{C}$.

The probability measure $\nu^{(f,g)}$ of Lemma 2 is called the probability
measure induced by $\nu$ and $(f,g)$.

In this paper, we use the following special type of $\le_{tt}$-reduction.

Definition 12. A $\le_{tt}$-reduction $(f,g)$ is orderly if, for all
$x, y, u, v \in \{0,1\}^*$, if $x < y$, $u \in Q_{(f,g)}(x)$, and
$v \in Q_{(f,g)}(y)$, then $u < v$. That is, if $x$ precedes $y$ (in the
standard ordering of $\{0,1\}^*$), then every query of $(f,g)$ on input $x$
precedes every query of $(f,g)$ on input $y$.
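
The definitions of this subsection can be made concrete with a small sketch. It assumes languages are represented by membership predicates, that `f(x)` returns a tuple of query strings, and that `g(x)` returns a Python function on tuples of Booleans; the example reduction (query `x0` and `x1`, accept if either is in $A$) is hypothetical and chosen only because it is orderly.

```python
from itertools import combinations

def induced_map(f, g, in_A):
    """Return the membership predicate of F_(f,g)(A), given that of A."""
    def in_FA(x):
        answers = tuple(in_A(q) for q in f(x))  # the bits [[f_i(x) in A]]
        return bool(g(x)(answers))              # evaluate the truth table
    return in_FA

def standard_key(s):
    """Standard ordering of {0,1}*: first by length, then lexicographically."""
    return (len(s), s)

def is_orderly_on(f, inputs):
    """Check Definition 12 on a finite list of distinct inputs."""
    xs = sorted(inputs, key=standard_key)
    return all(standard_key(u) < standard_key(v)
               for x, y in combinations(xs, 2)
               for u in f(x) for v in f(y))

# Hypothetical example reduction: queries x0 and x1, truth table OR.
def f_pair(x):
    return (x + "0", x + "1")

def g_or(x):
    return any

def in_A(s):
    """Example language: strings containing two consecutive 1s."""
    return "11" in s

in_FA = induced_map(f_pair, g_or, in_A)
```

With these choices, `in_FA("01")` is true (the query `011` is in $A$) and `in_FA("10")` is false. A reduction whose single query is the reversal of its input is not orderly, because `01` precedes `10` while their queries appear in the opposite order.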

3 Martingale Contraction 

Given a positive coin-toss probability measure $\nu$, an orderly truth-table
reduction $(f,g)$, and a $\nu^{(f,g)}$-martingale $d$ (where $\nu^{(f,g)}$ is
the probability measure induced by $\nu$ and $(f,g)$), Breutzmann and Lutz
[3] showed how to construct a $\nu$-martingale $(f,g)^{\wedge}d$, called the
$(f,g)$-dilation of $d$, such that $(f,g)^{\wedge}d$ succeeds on $A$ whenever
$d$ succeeds on $F_{(f,g)}(A)$. In this section we present a dual of this
construction. Given $\nu$ and $(f,g)$ as above and a $\nu$-martingale $d$, we
show how to construct a $\nu^{(f,g)}$-supermartingale $(f,g)_{\wedge}d$,
called the $(f,g)$-contraction of $d$, such that $(f,g)_{\wedge}d$ succeeds
on $A$ whenever $d$ succeeds strongly on every element of
$F^{-1}_{(f,g)}(\{A\})$.

The notion of an $(f,g)$-step, introduced in [3], will also be useful here.

Definition 13. Let $(f,g)$ be an orderly $\le_{tt}$-reduction.

1. An $(f,g)$-step is a positive integer $l$ such that

2. For $k \in \mathbb{N}$, we let $step(k)$ be the least $(f,g)$-step $l$
such that $l \ge k$.

3. For $v, w \in \{0,1\}^*$, we write $v \succ w$ to indicate that
$w \sqsubseteq v$ and $|v| = step(|w| + 1)$. (That is, $v \succ w$ means
that $v$ is a proper extension of $w$ to the next step.)



Our construction makes use of a special-purpose inverse of $F_{(f,g)}$ that
depends on both $(f,g)$ and $d$.

Definition 14. Let $(f,g)$ be an orderly $\le_{tt}$-reduction, let $\nu$ be a
positive probability measure on $\mathbf{C}$, and let $d$ be a
$\nu$-martingale. Then the partial function

$$F^{-1}_{(f,g),d} : \{0,1\}^* \to \{0,1\}^* \qquad (17)$$

is defined recursively as follows.

(i) $F^{-1}_{(f,g),d}(\lambda) = \lambda$.

(ii) For $w \in \{0,1\}^*$ and $b \in \{0,1\}$, $F^{-1}_{(f,g),d}(wb)$ is the
lexicographically first string $v \succ F^{-1}_{(f,g),d}(w)$ such that
$F_{(f,g)}(v) = wb$ and, for all $v' \succ F^{-1}_{(f,g),d}(w)$ such that
$F_{(f,g)}(v') = wb$, we have $d(v) \le d(v')$. (That is, $v$ minimizes
$d(v)$ on the set of all $v \succ F^{-1}_{(f,g),d}(w)$ satisfying
$F_{(f,g)}(v) = wb$.)

Note that the function $F^{-1}_{(f,g),d}$ is strictly monotone (i.e.,
$w \sqsubset w'$ implies that
$F^{-1}_{(f,g),d}(w) \sqsubset F^{-1}_{(f,g),d}(w')$, provided that these
values exist), whence it extends naturally to a partial function

$$F^{-1}_{(f,g),d} : \mathbf{C} \to \mathbf{C}. \qquad (18)$$

It is easily verified that $F^{-1}_{(f,g),d}$ inverts $F_{(f,g)}$ in the
sense that, for all $x \in \{0,1\}^* \cup \mathbf{C}$, $F^{-1}_{(f,g),d}$
finds a preimage of $F_{(f,g)}(x)$, i.e.,

$$F_{(f,g)}(F^{-1}_{(f,g),d}(F_{(f,g)}(x))) = F_{(f,g)}(x). \qquad (19)$$



We now define the $(f,g)$-contraction of a $\nu$-martingale $d$.

Definition 15. Let $(f,g)$ be an orderly $\le_{tt}$-reduction, let $\nu$ be a
positive probability measure on $\mathbf{C}$, and let $d$ be a
$\nu$-martingale. Then the $(f,g)$-contraction of $d$ is the function

$$(f,g)_{\wedge}d : \{0,1\}^* \to [0, \infty) \qquad (20)$$

defined as follows.

(i) $(f,g)_{\wedge}d(\lambda) = d(\lambda)$.

(ii) For $w \in \{0,1\}^*$ and $b \in \{0,1\}$,

$$(f,g)_{\wedge}d(wb) = \begin{cases} d(F^{-1}_{(f,g),d}(wb)) & \text{if } d(F^{-1}_{(f,g),d}(wb)) \text{ is defined} \\ 2 \cdot (f,g)_{\wedge}d(w) & \text{otherwise.} \end{cases} \qquad (21)$$



Theorem 1 (Martingale Contraction Theorem). Assume that $\nu$ is a positive
probability measure on $\mathbf{C}$, $(f,g)$ is an orderly
$\le_{tt}$-reduction, and $d$ is a $\nu$-martingale. Then $(f,g)_{\wedge}d$
is a $\nu^{(f,g)}$-supermartingale. Moreover, for every language
$A \subseteq \{0,1\}^*$, if
$F^{-1}_{(f,g)}(\{A\}) \subseteq S^{\infty}_{\mathrm{str}}[d]$, then
$A \in S^{\infty}[(f,g)_{\wedge}d]$.

4 Bias Invariance

In this section we present our main results.

Definition 16. Let $(f,g)$ be a $\le_{tt}$-reduction.

1. $(f,g)$ is positive (briefly, a $\le_{pos\text{-}tt}$-reduction) if, for
all $A, B \subseteq \{0,1\}^*$, $A \subseteq B$ implies
$F_{(f,g)}(A) \subseteq F_{(f,g)}(B)$.

2. $(f,g)$ is polynomial-time computable (briefly, a
$\le^{P}_{tt}$-reduction) if the functions $f$ and $g$ are computable in
polynomial time.

3. $(f,g)$ is polynomial-time computable with linear-bounded queries
(briefly, a $\le^{P,lin}_{tt}$-reduction) if $(f,g)$ is a
$\le^{P}_{tt}$-reduction and there is a constant $c \in \mathbb{N}$ such
that, for all $x \in \{0,1\}^*$,
$Q_{(f,g)}(x) \subseteq \{0,1\}^{\le c(1+|x|)}$.

4. $(f,g)$ is polynomial-time computable with a linear number of queries
(briefly, a $\le^{P}_{lin\text{-}tt}$-reduction) if $(f,g)$ is a
$\le^{P}_{tt}$-reduction and there is a constant $c \in \mathbb{N}$ such
that, for all $x \in \{0,1\}^*$, $|Q_{(f,g)}(x)| \le c(1 + |x|)$.

Of course, a $\le^{P,lin}_{pos\text{-}tt}$-reduction is a
$\le_{tt}$-reduction with properties 1-3, and a
$\le^{P,lin}_{pos\text{-}lin\text{-}tt}$-reduction is a $\le_{tt}$-reduction
with properties 1-4.
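
Property 1 of Definition 16 can be tested by brute force on a finite set of inputs. The sketch below is only an illustration, assuming the same representation as before (`f(x)` a tuple of queries, `g(x)` a function on tuples of Booleans); since $F_{(f,g)}(A)$ on these inputs depends only on which queried strings lie in $A$, it suffices to range over subsets of the queried strings. All names are hypothetical.

```python
from itertools import combinations

def subsets(universe):
    """Yield every subset of a finite list as a frozenset."""
    for k in range(len(universe) + 1):
        for chosen in combinations(universe, k):
            yield frozenset(chosen)

def image(f, g, A, inputs):
    """F_(f,g)(A) restricted to the given finite list of inputs."""
    return frozenset(x for x in inputs
                     if g(x)(tuple(q in A for q in f(x))))

def is_positive_on(f, g, inputs):
    """Check that A subset of B implies F(A) subset of F(B) on the inputs."""
    universe = sorted({q for x in inputs for q in f(x)})
    all_sets = list(subsets(universe))
    return all(image(f, g, A, inputs) <= image(f, g, B, inputs)
               for A in all_sets for B in all_sets if A <= B)

# Hypothetical reductions: the same queries with two different truth tables.
def f_pair(x):
    return (x + "0", x + "1")

def g_or(x):
    return any                               # monotone truth table

def g_nor(x):
    return lambda answers: not any(answers)  # non-monotone truth table
```

A reduction all of whose truth tables are monotone Boolean functions passes this check, which is a sufficient condition for positivity; the NOR variant fails it, since enlarging $A$ can remove strings from $F_{(f,g)}(A)$.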

We now present the Positive Bias Reduction Theorem. This strengthens the
identically-named result of Breutzmann and Lutz [3] by giving a
$\le^{P,lin}_{pos\text{-}lin\text{-}tt}$-reduction in place of a
$\le^{P,lin}_{pos\text{-}tt}$-reduction. This technical improvement, which is
essential for our purposes here, requires a substantially different
construction. Details are omitted.

Theorem 2 (Positive Bias Reduction Theorem). Let $\beta$ and $\beta'$ be
strongly positive, P-exact sequences of biases. Then there exists an orderly
$\le^{P,lin}_{pos\text{-}lin\text{-}tt}$-reduction $(f,g)$, and the
probability measure induced by $\mu^{\beta}$ and $(f,g)$ is a coin-toss
probability measure $\mu^{\beta''}$, where $\beta'' \approx \beta'$.

The following result is our main theorem. 

Theorem 3 (Bias Invariance Theorem). Assume that $\beta$ and $\beta'$ are
strongly positive P-sequences of biases, and let $\mathcal{C}$ be a class of
languages that is closed upwards or downwards under
$\le^{P,lin}_{pos\text{-}lin\text{-}tt}$-reductions. Then

$$\mu^{\beta}_{p}(\mathcal{C}) = 0 \iff \mu^{\beta'}_{p}(\mathcal{C}) = 0. \qquad (22)$$

The “downwards” part of Theorem 3 is a technical improvement of the Bias
Equivalence Theorem of [3] from $\le^{P,lin}_{pos\text{-}tt}$-reductions to
$\le^{P,lin}_{pos\text{-}lin\text{-}tt}$-reductions. The proof of this
improvement is simply the proof in [3] with Theorem 2 used in place of its
predecessor in [3].

The “upwards” part of Theorem 3 is entirely new. The proof of this result is
similar to the proof of the Bias Equivalence Theorem in [3], but now in
addition to using our improved Positive Bias Reduction Theorem, we use the
Martingale Contraction Theorem of Section 3 in place of the Martingale
Dilation Theorem of [3]. We also note that the linear bound on the number of
queries in Theorem 2 is essential for the “upwards” direction.

If $\le^{P}_{r}$ is a polynomial reducibility, then a class $\mathcal{C}$ is
closed upwards under $\le^{P}_{r}$-reductions if and only if its complement
$\mathcal{C}^{c}$ is closed downwards under $\le^{P}_{r}$-reductions. We
thus have the following immediate consequence of Theorem 3.

Corollary 1. Assume that $\beta$ and $\beta'$ are strongly positive
P-sequences of biases, and let $\mathcal{C}$ be a class of languages that is
closed upwards or downwards under
$\le^{P,lin}_{pos\text{-}lin\text{-}tt}$-reductions. Then

$$\mu^{\beta}_{p}(\mathcal{C}) = 1 \iff \mu^{\beta'}_{p}(\mathcal{C}) = 1. \qquad (23)$$

We now mention some consequences of Theorem 3, beginning with a discussion
of the measure of the complete $\le^{P}_{T}$-degree for exponential time,
and its consequences for the BPP versus E problem.

For each class $\mathcal{D}$ of languages, we use the notations

$$\mathcal{H}_{T}(\mathcal{D}) = \{A \mid A \text{ is } \le^{P}_{T}\text{-hard for } \mathcal{D}\}, \qquad (24)$$

$$\mathcal{C}_{T}(\mathcal{D}) = \{A \mid A \text{ is } \le^{P}_{T}\text{-complete for } \mathcal{D}\}, \qquad (25)$$

and similarly for other reducibilities. The following easy observation shows
that every consequence of $\mu(\mathcal{C}_{T}(E_2) \mid E_2) \neq 1$ is also
a consequence of $\mu(\mathcal{C}_{T}(E) \mid E) \neq 1$.

Lemma 3. $\mu(\mathcal{C}_{T}(E) \mid E) \neq 1 \implies \mu(\mathcal{C}_{T}(E_2) \mid E_2) \neq 1$.

Proof. Juedes and Lutz have shown that, if $X$ is a set of languages that is
closed downwards under $\le^{P}_{m}$-reductions, then
$\mu(X \mid E_2) = 0 \implies \mu(X \mid E) = 0$. Applying this result with
$X = \mathcal{H}_{T}(E)^{c} = \mathcal{H}_{T}(E_2)^{c}$ yields the lemma.

Allender and Strauss have proven that
$\mu_{p}(\mathcal{H}_{T}(\mathrm{BPP})) = 1$. Buhrman, van Melkebeek, Regan,
Sivakumar, and Strauss have noted that this implies that
$\mu(\mathcal{C}_{T}(E_2) \mid E_2) \neq 1 \implies E_2 \not\subseteq \mathrm{BPP}$.
Combining this argument with Corollary 1 yields the following extension.

Corollary 2. If there exists a strongly positive P-sequence of biases
$\beta$ such that $\mu^{\beta}(\mathcal{C}_{T}(E_2) \mid E_2) \neq 1$, then
$E \not\subseteq \mathrm{BPP}$.

Regan, Sivakumar, and Cai have proven a “most is all” lemma, stating that if
$\mathcal{C}$ is any class of languages that is either closed under finite
unions and intersections or closed under symmetric difference, then
$\mu(\mathcal{C} \mid E) = 1 \implies E \subseteq \mathcal{C}$. Combining
this with Corollary 1 gives the following extended “most is all” result.

Corollary 3. Let $\mathcal{C}$ be a class of languages that is closed upwards
or downwards under $\le^{P,lin}_{pos\text{-}lin\text{-}tt}$-reductions, and
is also closed under either finite unions and intersections or symmetric
difference. If there is any strongly positive P-sequence of biases $\beta$
such that $\mu^{\beta}(\mathcal{C} \mid E) = 1$, then
$E \subseteq \mathcal{C}$.

Of course, the analogous result holds for $E_2$.

We conclude with a brief discussion of small span theorems. Given a
polynomial reducibility $\le^{P}_{r}$, the lower $\le^{P}_{r}$-span of a
language $A$ is

$$P_{r}(A) = \{B \mid B \le^{P}_{r} A\}, \qquad (26)$$

and the upper $\le^{P}_{r}$-span of $A$ is

$$P_{r}^{-1}(A) = \{B \mid A \le^{P}_{r} B\}. \qquad (27)$$

We will use the following compact notation.

Definition 17. Let $\le^{P}_{r}$ be a polynomial reducibility type, and let
$\nu$ be a probability measure on $\mathbf{C}$. Then the small span theorem
for $\le^{P}_{r}$-reductions in the class $E$ over the probability measure
$\nu$ is the assertion

$$\mathrm{SST}_{\nu}(\le^{P}_{r}, E) \qquad (28)$$

stating that, for every $A \in E$, $\nu(P_{r}(A) \mid E) = 0$ or
$\nu_{p}(P_{r}^{-1}(A)) = \nu(P_{r}^{-1}(A) \mid E) = 0$. When the
probability measure is $\mu$, we omit it from the notation, writing
$\mathrm{SST}(\le^{P}_{r}, E)$ for $\mathrm{SST}_{\mu}(\le^{P}_{r}, E)$.
Similar assertions for other classes, for example,
$\mathrm{SST}_{\nu}(\le^{P}_{r}, E_2)$, are defined in the now-obvious
manner.

Juedes and Lutz proved the first small span theorems,
$\mathrm{SST}(\le^{P}_{m}, E)$ and $\mathrm{SST}(\le^{P}_{m}, E_2)$, and
noted that extending either to $\le^{P}_{T}$ would establish
$E \not\subseteq \mathrm{BPP}$. Lindner established
$\mathrm{SST}(\le^{P}_{1\text{-}tt}, E)$ and
$\mathrm{SST}(\le^{P}_{1\text{-}tt}, E_2)$, and Ambos-Spies, Neis, and
Terwijn proved $\mathrm{SST}(\le^{P}_{k\text{-}tt}, E)$ and
$\mathrm{SST}(\le^{P}_{k\text{-}tt}, E_2)$ for all fixed
$k \in \mathbb{N}$. Very recently, Buhrman and van Melkebeek have taken a
major step forward by proving $\mathrm{SST}(\le^{P}_{g(n)\text{-}tt}, E_2)$
for every function $g(n)$ satisfying $g(n) = n^{o(1)}$. We note that the
Bias Invariance Theorem implies that small span theorems lying “just beyond”
this latter result are somewhat robust with respect to changes of biases.

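Theorem 4. If $\le^{P}_{r}$ is a polynomial reducibility such that
$A \le^{P,lin}_{pos\text{-}lin\text{-}tt} B$ implies $A \le^{P}_{r} B$, then
for every strongly positive P-sequence of biases $\beta$,

$$\mathrm{SST}_{\mu^{\beta}}(\le^{P}_{r}, E) \iff \mathrm{SST}(\le^{P}_{r}, E), \qquad (29)$$

and similarly for $E_2$.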

Acknowledgment. The first author thanks Steve Kautz for a very useful 
discussion. 

Abstract. We clarify the computational complexity of planarity testing,
by showing that planarity testing is hard for L, and lies in SL. This nearly
settles the question, since it is widely conjectured that L = SL.
The upper bound of SL matches the lower bound of L in the context of
(nonuniform) circuit complexity, since L/poly is equal to SL/poly.
Similarly, we show that a planar embedding, when one exists, can be
found in FL$^{\mathrm{SL}}$.

Previously, these problems were known to reside in the complexity class
AC$^1$, via an $O(\log n)$ time CRCW PRAM algorithm [22], although planarity
checking for degree-three graphs had been shown to be in SL [23].



1 Introduction 

The problem of determining if a graph is planar has been studied from several
perspectives of algorithmic research. From most perspectives, optimal
algorithms are already known. Linear-time sequential algorithms were
presented by Hopcroft and Tarjan and (via another approach) by combining
several earlier results. In the context of parallel computation, a
logarithmic-time CRCW-PRAM algorithm was presented by Ramachandran and
Reif [22] that performs almost linear work.

From the perspective of computational complexity theory, however, the
situation has been far from clear. The best upper bound on the complexity of
planarity that has been published so far is the bound of AC$^1$ that follows
from the logarithmic-time CRCW-PRAM algorithm of Ramachandran and Reif [22].
In a recent survey of problems in the complexity class SL, the planarity
problem for graphs of bounded degree is listed as belonging to SL, but this
is based on the claim in [23] that checking planarity for bounded degree
graphs is in the “Symmetric Complementation Hierarchy”, and on the fact that
SL is closed under complement [20] (and thus this hierarchy collapses to SL).
However, the algorithm presented in [23] actually works only for graphs of
degree 3, and no straightforward generalization to graphs of larger degree
is known. (This is implicitly acknowledged in [23, pp. 518-519].)
Interestingly, Mario Szegedy has pointed out to us (personal communication)
that an algebraic structure proposed by Tutte [28], when combined with more
recent results about span programs and counting classes, gives a
$\oplus$L algorithm for planarity testing. It is listed as an open question
by Ja'Ja' and Simon if planarity is in NL, although the subsequent discovery
that NL is closed under complementation allows one to verify that one of the
algorithms of Ja'Ja' and Simon can in fact be implemented in NL. It remains
an open question if their algorithm can be implemented in SL, but in this
paper we observe that the algorithm of Ramachandran and Reif can be
implemented in SL.

* Supported in part by NSF grant CCR-9734918.

** Part of this work was done when this author was supported by the NSF grant
CCR-9734918 on a visit to Rutgers University during summer 1999.

We also show that the planarity problem is hard for L under projection
reducibility.

Recall that

L $\subseteq$ SL $\subseteq$ NL $\subseteq$ AC$^1$
SL $\subseteq$ $\oplus$L.

L (respectively SL, NL) denotes deterministic (respectively symmetric,
nondeterministic) logarithmic space. AC$^1$ denotes problems solvable by
polynomial size AND-OR circuits of logarithmic depth, where the gates are
allowed to have any number of inputs. The class $\oplus$L consists of
problems solvable by nondeterministic logspace machines with an odd number
of accepting paths. Although it is not known if NL is contained in
$\oplus$L, it is known that NL is contained in $\oplus$L/poly.

This essentially solves the question of planarity from the complexity-theoretic
point of view. To see this, it is sufficient to recall that it is widely
conjectured that SL = L. This conjecture is based on the following
considerations:

— The standard complete problem for SL is the graph accessibility problem
for undirected graphs (UGAP). Upper bounds on the space complexity of
UGAP have been dropping, from $\log^{2} n$ through $\log^{1.5} n$ to
$\log^{4/3} n$. It is suspected that this trend will continue to
eventually reach $\log n$.

— UGAP can be solved in randomized logspace. Recent developments in
derandomization techniques have led many researchers to conjecture that
randomized logspace is equal to L.
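
The randomized logspace bound for UGAP mentioned above is usually obtained by a random walk. The sketch below shows that standard idea rather than anything spelled out in this paper: the walker remembers only its current vertex and a step counter, so its working memory is $O(\log n)$ bits. The function name, the adjacency-list representation and the step budget of $8n^3$ are illustrative assumptions.

```python
import random

def random_walk_ugap(adj, s, t, rng, steps=None):
    """One-sided-error test of whether t is reachable from s in an
    undirected graph given as a dict of adjacency lists.

    Only the current vertex and a counter are stored between steps.
    A True answer is always correct; a False answer may be wrong with
    small probability when s and t are in fact connected.
    """
    n = len(adj)
    if steps is None:
        steps = 8 * n ** 3  # assumed budget, comfortably above the cover time
    current = s
    for _ in range(steps):
        if current == t:
            return True
        if not adj[current]:  # isolated vertex: the walk cannot move
            return False
        current = rng.choice(adj[current])
    return current == t

# Example graph: a path 0-1-2 and a separate edge 3-4.
example = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
```

On `example`, the walk from 0 can never report that 3 is reachable, and it reports that 2 is reachable except with negligible probability.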

In the context of nonuniform complexity theory, the corresponding nonuniform
complexity classes L/poly and SL/poly are equal.$^1$ Hence in this setting,
the computational complexity of planarity is resolved; it is complete for
L/poly under projections.

$^1$ That is, a universal traversal sequence can be used as an “advice
string” to enable a logspace-bounded machine to solve UGAP.

One consequence of our result is that counting the number of perfect
matchings in a planar graph is reducible to the determinant, when the graph
is presented as an adjacency matrix. More precisely, it follows from this
paper and from prior work that there is a (nonuniform) projection that takes
as input the adjacency matrix of a graph G, and produces as output a matrix
M with the property that if G is planar then the absolute value of det(M) is
the number of perfect matchings in G. (Sketch: Given the proper advice
strings, a GapL algorithm can take as input the matrix M, compute its planar
embedding (since this is in L/poly), then compute its “normal form
embedding” along a unique computation path (since
NL $\subseteq$ UL/poly), and then use a known algorithm to compute a number
whose absolute value is the number of perfect matchings in M. Since the
determinant is complete for GapL under projections, the result follows.)

The paper is organized as follows. In Section 2 we present our hardness result
for planarity. In Section 3 we sketch the main parts of our analysis, showing
that the algorithm of [22] can be implemented in SL.

2 Hardness of Planarity Testing 

The following problem is known to be complete for L: 

Definition 1. Undirected Forest Accessibility (UFA): Given an undirected forest 
G and vertices u, v, decide if u and v are in the same tree. 

The hardness of this problem for L was shown in earlier work, where only an
NC$^1$ reduction is claimed. However, it is easy to see that this problem is
actually hard under uniform projections as well. (To see this, consider any
L machine M; assume that configurations are time-stamped, that there is a
unique accepting configuration $c_h$ whose successor is itself (with the next
time-stamp), and that M decides within time $n^k$ for some constant $k$.
Construct the polynomially sized computation graph G where time-stamped
configurations are the vertices, and edges denoting machine moves are labeled
by input bits or their negations. M accepts its input if and only if
$(G, (c_0, 0), (c_h, n^k))$ is in UFA, where $c_0$ is the initial
configuration.)

Let G' be the complete graph on 5 vertices, minus any one edge (p, q).

The graph H is obtained by identifying vertices u and v of G (from the UFA
instance) with vertices p and q of G'. Clearly, H is planar if and only if
(G, u, v) is not in UFA.
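
The construction of H can be sketched as follows. The code only builds the edge list of H and, for comparison, decides UFA with a union-find structure (which is not a logspace procedure). The gadget vertex labels `("gadget", i)` are hypothetical and assumed not to occur in G, and u and v are assumed distinct.

```python
from itertools import combinations

def ufa_to_planarity(forest_edges, u, v):
    """Return the edges of H: the forest G together with a copy of K5
    minus the edge (p, q), where p is identified with u and q with v."""
    extra = [("gadget", i) for i in range(3)]  # three fresh vertices
    k5_vertices = [u, v] + extra
    gadget = [e for e in combinations(k5_vertices, 2) if e != (u, v)]
    return list(forest_edges) + gadget

def same_tree(forest_edges, u, v):
    """Reference answer to UFA using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in forest_edges:
        parent[find(a)] = find(b)
    return find(u) == find(v)

# Example forest: the path 1-2-3 and the single edge 4-5.
forest = [(1, 2), (2, 3), (4, 5)]
```

If u and v lie in the same tree, the tree path between them plays the role of the missing edge (p, q), so H contains a subdivision of the complete graph on 5 vertices and is not planar. Otherwise H is a planar gadget with trees attached at single vertices, which stays planar.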

We have thus proved the following theorem: 

Theorem 1. Planarity testing is hard for L under projections. 

It is worth noting that planarity testing remains hard for L even for graphs
of degree 3. This does not follow from the construction given above, but can
be shown by modifying a known construction. Details will appear in the final
paper.

3 The SL Algorithm for Planarity Testing and Embedding 

We describe here how the algorithm of Ramachandran and Reif [22] can be
implemented in SL. The algorithm of Ramachandran and Reif is complex, and it
involves a number of fairly involved technical definitions. Due to space
limitations, it is not possible to present all of the necessary technical
definitions here. Therefore, we have written this section so that it can be
read as a companion to [22]. We use the same notation as is used in [22], and
we will show that each step of their algorithm can be computed in
L$^{\mathrm{SL}}$, or by NC$^1$ circuits with oracle gates for the
reachability problem in undirected graphs. Since SL is closed under NC$^1$
reducibility and logspace-Turing reducibility, it follows that the entire
algorithm can be implemented in SL.

Our approach will be as follows. First, we present some general-purpose
algorithms for operating on graphs and trees. Next, we show how an open ear
decomposition can be computed in SL; the parallel algorithm to perform this
step is also fairly complex, and space limitations prevent us from presenting
all of the details for this step. Therefore, we present this section so that
it can be used as a companion to Ramachandran's presentation of the open ear
decomposition algorithm. Finally, we go through the other steps of the
algorithm of [22].

3.1 Elementary Graph Computations in SL 

Our method of exposition in this subsection is to give a statement of the sub- 
problem to be solved, and then in parentheses give an indication of how this 
subproblem can be restated in a way that makes it clear that it can be solved 
using an oracle for undirected reachability, or by making use of primitive oper- 
ations that have already been discussed. 

Given a graph G, the following conditions can be checked in SL: 

1. Are u and v in the same 2-component? (Algorithm: for each vertex x, check
if the removal of x separates u and v. This can be tested using UGAP.)

2. Let each 2-component be labeled by the smallest two vertices in the
2-component. Is (u, v) the “name” of a 2-component? (First check that u and
v are in the same 2-component, and then check that no x < max(u, v) with
x ∉ {u, v} is in the same 2-component.)

3. Is u a cut-vertex? (Are there vertices v, w connected in G but not in
G − {u}?)

4. Is there a path (not necessarily simple) of odd length between vertices s
and t? (Make two copies of each vertex. Replace edge (u, v) by edges
($u_0$, $v_1$) and ($u_1$, $v_0$). Check if $s_0$, $t_1$ are connected in
this new graph.)

5. Is G bipartite (i.e. 2-colorable)? [22, 23]

6. If G is connected, 2-colorable, and vertex 1 is colored 1, is vertex i
colored 2? (Test if there is a path of odd length from 1 to i.)

7. Is edge e in the lexicographically first spanning tree T of G (under the
standard ordering of edges)? [21]
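
Several of the tests above reduce directly to undirected reachability. The sketch below illustrates items 1, 3, 4 and 5, with breadth-first search standing in for the UGAP oracle (so it shows the reductions, not the space bound). Graphs are assumed to be dicts of adjacency lists; the function names, and the odd-closed-walk formulation used for bipartiteness, are illustrative choices rather than the paper's.

```python
from collections import deque

def connected(adj, s, t, removed=frozenset()):
    """Undirected reachability from s to t, avoiding the removed vertices."""
    if s in removed or t in removed:
        return False
    seen, queue = {s}, deque([s])
    while queue:
        x = queue.popleft()
        if x == t:
            return True
        for y in adj[x]:
            if y not in seen and y not in removed:
                seen.add(y)
                queue.append(y)
    return False

def same_2_component(adj, u, v):
    """Item 1: no single vertex other than u and v separates u from v."""
    return connected(adj, u, v) and all(
        connected(adj, u, v, removed=frozenset([x]))
        for x in adj if x not in (u, v))

def is_cut_vertex(adj, u):
    """Item 3: some pair v, w is connected in G but not in G - {u}."""
    others = [x for x in adj if x != u]
    return any(connected(adj, v, w)
               and not connected(adj, v, w, removed=frozenset([u]))
               for v in others for w in others)

def odd_path(adj, s, t):
    """Item 4: in the double cover, edge (u, v) becomes (u0, v1), (u1, v0)."""
    cover = {(x, b): [(y, 1 - b) for y in adj[x]]
             for x in adj for b in (0, 1)}
    return connected(cover, (s, 0), (t, 1))

def is_bipartite(adj):
    """Item 5 (one standard formulation): no vertex lies on an odd closed walk."""
    return not any(odd_path(adj, x, x) for x in adj)

path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
```

On `path`, vertex 1 is a cut-vertex, vertices 0 and 2 are in different 2-components, and there is an odd-length path from 0 to 3 but not from 0 to 2. The triangle is not bipartite.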

Given a graph G and a spanning tree T, the following conditions can be checked 
in SL: 

1. For e ∈ T with e = (x, y), does x → y occur at position i of the
lexicographically first Euler tour rooted at r, $ET_r$? Does x → y precede
y → x? (In logspace, one can compute the lexicographically-first Euler tour
by starting at r and following the edge r → x, where x is the smallest
neighbor of r in T. At any stage in the tour, if the most recent edge
traversed was u → v, the next edge in the Euler tour is v → z where z is the
smallest neighbor of v greater than u in T if such a neighbor exists, and z
is the smallest neighbor of v otherwise.)

2. Is u = parent(v) when T is rooted at r? (Equivalently, is u → v the first
edge of $ET_r$ to touch v? This can be checked in L.)

3. If T is rooted at r, is u a descendant of v? (Equivalently, does the first
occurrence of u in $ET_r$ lie between the first and last occurrences of v?)

4. Is z the least common ancestor (lca) of vertices x and y in T? (Check that
x and y are both descendants of z, and check that this property is not shared
by any descendant of z.)

5. Is i the preorder number of vertex u? (Count the number of vertices with
first occurrence before that of u in $ET_r$.)

6. Is vertex u on the fundamental cycle $C_e$ created by non-tree edge e with
T? (Let e = (p, q). Vertex u is on $C_e$ iff the graph T − {u} has no path
from p to q.)

7. Is edge f on $C_e$? (This holds iff f = e or f ∈ T and both endpoints of
f are on $C_e$.)

8. Are vertices u, v on the same bridge with respect to $C_e$? (See [22] for
a definition of “bridge”. Vertices u and v are on the same bridge iff there
is a path from u to v in G, with no internal vertices of the path belonging
to $C_e$.)

9. Are edges f, g on the same bridge with respect to $C_e$? (This holds iff
f, g ∉ $C_e$, neither f nor g is a trivial bridge (i.e. a chord of $C_e$),
and the endpoints of f, g which are not on $C_e$ are on the same bridge with
respect to $C_e$.)

10. Is vertex u a point of attachment of the bridge of $C_e$ that contains
edge f? (Let f = ($f_1$, $f_2$). If both $f_1$ and $f_2$ are on $C_e$, then
these are the only points of attachment of the trivial bridge {f}.
Otherwise, if $f_i$ is not on $C_e$, then u is a point of attachment iff
u ∈ $C_e$ and u, $f_i$ are on the same bridge with respect to $C_e$.)

11. Given vertices u, v on $C_e$, and given a sequence ($w_1$, $w_2$, ...),
is there a path from u to v along $C_e$ avoiding vertices
{$w_1$, $w_2$, ...}? (This is simply the question of connectivity in
$C_e$ − {$w_1$, $w_2$, ...}.)

12. Relative to $C_e$, do the bridges containing edges f and g interlace?
(See [22] for a definition of “interlacing”. Either there is a triple
u, v, w where all three vertices are points of attachment of both bridges,
or there is a 4-tuple u, v, w, x where (1) u, w are attachment points of the
bridge containing f, (2) v, x are attachment points of the bridge containing
g, and (3) u, v, w, x occur in cyclic order on $C_e$. To check cyclic order,
use the previous test.)
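
The Euler-tour rule of item 1, and the tests of items 2, 3 and 5 that are phrased in terms of it, can be sketched as follows. The code stores the whole tour in a list for readability, which a logspace machine would not do; it would recompute tour positions on demand. Vertices are assumed to be comparable labels, the tree is a dict of adjacency lists, and a vertex is counted as a descendant of itself. All function names are illustrative.

```python
def euler_tour(tree_adj, r):
    """Lexicographically first Euler tour of a tree rooted at r,
    returned as a list of directed edges."""
    if not tree_adj[r]:
        return []
    nbrs = {x: sorted(ys) for x, ys in tree_adj.items()}
    first = (r, nbrs[r][0])          # start along the smallest neighbor of r
    tour, (u, v) = [first], first
    while True:
        larger = [z for z in nbrs[v] if z > u]
        z = larger[0] if larger else nbrs[v][0]   # next neighbor, cyclically
        if (v, z) == first:          # the tour has closed up
            return tour
        tour.append((v, z))
        u, v = v, z

def vertex_sequence(tour, r):
    """Vertices in the order the tour visits them, starting at the root."""
    return [r] + [y for _, y in tour]

def parent(tour, r, v):
    """Item 2: the tail of the first tour edge that enters v."""
    if v == r:
        return None
    for x, y in tour:
        if y == v:
            return x
    return None

def is_descendant(tour, r, u, v):
    """Item 3: first occurrence of u lies between first and last of v."""
    seq = vertex_sequence(tour, r)
    first_u, first_v = seq.index(u), seq.index(v)
    last_v = len(seq) - 1 - seq[::-1].index(v)
    return first_v <= first_u <= last_v

def preorder_number(tour, r, u):
    """Item 5: number of vertices first seen before u (the root gets 0)."""
    seq = vertex_sequence(tour, r)
    return len(set(seq[:seq.index(u)]))

# Example tree with edges 1-2, 1-3, 2-4, 2-5, rooted at 1.
tree = {1: [2, 3], 2: [1, 4, 5], 3: [1], 4: [2], 5: [2]}
tour = euler_tour(tree, 1)
```

For this tree the tour visits the vertices in the order 1, 2, 4, 2, 5, 2, 1, 3, 1, so the preorder numbers of 1, 2, 4, 5, 3 are 0, 1, 2, 3, 4.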

3.2 Finding an Open Ear Decomposition 

We follow Ramachandran's exposition. The algorithm in Figure 1 finds an open
ear decomposition of a graph: it labels each edge e by the number of the
first ear containing e.

input: biconnected graph G; vertices v, r; edge e

1: Find a spanning tree T, and number the vertices in preorder from 0 to n − 1
with respect to root r.

2: Label the non-tree edges:

2.1 For each vertex v other than r, find low(v). Mark v if low(v) < parent(v).

2.2 Construct multigraph H: $V_H$ =

For each e ∉ T with distinct base vertices x, y, put edge (x, y) in $E_H$;

For each e ∉ T with a single base vertex x, put edge (x, x) in $E_H$;

2.3 Find the connected components of H. Label a component C by
preorder(parent(a)) for some (arbitrary) a ∈ C.

2.4 Within each component C, find a spanning tree $T_C$, root it at a marked
vertex if one exists, and preorder the vertices 0, 1, ..., k.

2.5 For e = (parent(y), y) ∈ $T_C$, label e with the pair (label(C), y).

For e ∉ $T_C$, label e with the pair (label(C), k + 1).

2.6 For e ∉ T, label e with the label of the corresponding edge in H.

2.7 Sort labels in non-decreasing order and relabel as 1, 2, ....

3: Label a tree edge (parent(v), v) by the smallest label of any non-tree edge
incident on a descendant of v (including v).

4: Relabel the non-tree edge labeled 1 by the label 0.

Fig. 1. Open Ear Decomposition Algorithm



In this procedure, most of the computations involve computing spanning
trees, finding connected components, preordering a tree, and sorting labels,
all of which can be performed in SL. The only new steps here are the
computation of base vertices and low vertices. These are also easily seen to
be computable in SL using the operations from Subsection 3.1; note that

1. z is a base vertex of non-tree edge e = (x, y) if parent(z) = lca(x, y)
and either x or y is a descendant of z.

2. low(v) is the smallest w such that w ∈ $C_e$ for a non-tree edge
e = ($b_1$, $b_2$) with $b_1$ or $b_2$ a descendant of v.

3.3 An Overview of the Algorithm 

The planarity testing algorithm of Ramachandran and Reif [22] is outlined in
Figure 2. If G* is not 2-colorable (step 2.7), or if step 2.8 yields an
embedding that is not planar, then the input graph is not planar. Otherwise,
this procedure gives a planar combinatorial embedding of the input graph.
For the complete algorithm and definitions of the terms used above, see [22].

The emphasis in [22] is to find a fast parallel algorithm that performs
almost optimal work. However, for our purpose, any procedure that can be
implemented in SL will do. Step 1 can be accomplished by determining, for
each (u, v), if u and v are in the same biconnected component. Step 2.1 was
addressed in Subsection 3.2. Step 2.4 has been discussed in Subsection 3.1.
The remaining steps are discussed in the following subsections.

1: Decompose the input graph into its biconnected components.

2: For each biconnected component G, do

2.1 Find an open ear decomposition D = ($P_0$, $P_1$, ..., $P_{r-1}$) with
$P_0$ = (s, t).

2.2 Direct the ears to get directed acyclic graph $G_{st}$.

2.3 Construct the local replacement graph $G_l$ and the associated spanning
tree $T_{st}$ and paths D' = ($P'_0$, $P'_1$, ..., $P'_{r-1}$).

2.4 Compute the bridges of each fundamental cycle $C'_i$.

2.5 Compute a bunch collection for each $P'_i$, and a hook for each bunch.

2.6 For each $P'_i$, construct its bunch graph and the corresponding
interlacing parity graph.

2.7 Construct the constraint graph G* and 2-color it, if possible.

2.8 From the 2-coloring, obtain a combinatorial embedding of $G_l$ and hence
G. Test if this embedding is planar.

3: Piece together the embeddings of the biconnected components.

Fig. 2. Planarity Testing Algorithm



3.4 Constructing the Directed st-Numbering Graph Gst 

Given an open ear decomposition D = [$P_0$, ..., $P_{r-1}$] of a biconnected
graph G, where $P_0$ consists of the edge (s, t), the graph $G_{st}$ is the
result of orienting each edge of G, so that

— The edge (s, t) is oriented s → t.

— Let the two endpoints of an ear $P_i$ be the vertices u and v.

• If ear(v) < ear(u), then all edges on $P_i$ are oriented to form a path
from v to u.

• If ear(u) < ear(v), then all edges on $P_i$ are oriented to form a path
from u to v.

• If ear(u) = ear(v), then all edges on $P_i$ are oriented to form a path
from u to v if u comes before v in the orientation on $P_{ear(v)}$, and are
oriented to form a path from v to u otherwise.

$G_{st}$ is acyclic, and every vertex lies on a path from s to t.

We show that $G_{st}$ can be computed from G and D in logarithmic space.
Orienting the edges in ear $P_i$ is easy if ear(u) ≠ ear(v). The routine
shown in Figure 3 shows how to orient the edges if ear(u) = ear(v) = i'. It
is clear that this can be implemented in L.



Input (D, i)

Find the endpoints u and v of $P_i$. Let ear(u) = ear(v) = i'.

Note that $P_i$ is oriented from u to v iff u comes before v in $P_{i'}$.

1. Let u' and v' be the endpoints of $P_{i'}$. Compute the bit B such that

[u comes before v in $P_{i'}$] iff [(u' comes before v' in $P_{i'}$) ⊕ B].

(This can be done in logspace since $P_{i'}$ is a path. The routine can start
at u' and see if it encounters u or v first.)

2. If ear(v') ≠ ear(u')

Then we can compute the orientation directly.

Else

Let i'' = ear(v') = ear(u')

Let the endpoints of $P_{i''}$ be u'' and v''.

Find and remember the bit B' such that

[u' comes before v' in $P_{i''}$] iff [(u'' comes before v'' in $P_{i''}$) ⊕ B']

(At this point, we can forget about u' and v'.)
u' := u''; v' := v''; i' := i''; B := B ⊕ B'

GO TO statement 2.
end procedure

Fig. 3. Orienting an ear.

3.5 Constructing the Local Replacement Graph $G_l$

In $G_l$, each vertex v is replaced by a rooted tree $T_v$ with d(v) − 1
vertices, one for each ear containing v. The construction exploits the fact
that in the directed graph $G_{st}$, deleting the last edge of each path
$P_i$ for i > 0 gives a spanning tree $T_{st}$. The construction introduces
new vertices, and maps each $P_i$ to a path $P'_i$ which is essentially the
same as $P_i$, but has an extra edge involving a new vertex at each end.

The construction of $G_l$ proceeds in 3 phases. In the first / second phase,
the first / last edge of each ear is rerouted to a possibly new endpoint via
one of the new vertices. In the last phase, some of the new edges are further
rerouted to account for parallel ears.

The entire construction uses only the elementary operations described in
Subsection 3.1 and so can be implemented in FL$^{\mathrm{SL}}$. The
implementation immediately yields the new directed graph $G_{st}$ and a
listing of the new left and right endpoints $L(P'_i)$ and $R(P'_i)$ of each
path.



3.6 Bunch Collections and Hooks 

In the spanning tree of the graph $G_l$, each path $P_i'$ has a unique non-tree
edge, which forms the fundamental cycle $C_i'$ with respect to this tree. In [22], each
bridge of $C_i'$ is classified as spanning, anchor or non-anchor depending on how
its attachment points on $C_i'$ are placed with respect to $P_i'$. Since bridges can be
computed in $\mathrm{FL}^{\mathrm{SL}}$ (see an earlier subsection), this classification is also in SL.

In the nomenclature of [22], bunches are approximations to bridges: bunches
contain only the attachment edges of bridges. A bridge is represented by at
least one and possibly more than one bunch, subject to certain conditions. The
conditions are: (1) A non-anchor bunch must be the entire bridge. (2) A spanning
bunch must contain all attachment points of the corresponding bridge on internal
vertices of $P_i'$ and at least one edge attaching on $L(P_i')$. (3) Edges within a
bunch must be connected in $G_l$ without using vertices from $C_i'$ or from the other
bunches. (4) The bunch collection for each $P_i'$ must contain all attachments of
bridges on its internal vertices and some attachment edges incident on $L(P_i')$.
Bunch collections are computed using operations described in an earlier subsection.

A representative edge for each anchor bunch $B$ is the hook $H(B)$, which also
is used to determine a planar embedding if $G$ turns out to be planar. $H(B)$
is usually an attachment on $C_i' - P_i'$ of the bridge of $C_i'$ that contains $B$. The
exception is when $L(P_i')$ is the lca of the non-tree edge of $P_i'$, in which case
$H(B)$ may be the incoming tree edge to $L(P_i')$. Again, the entire procedure for
computing hooks uses operations shown to be in SL in an earlier subsection, so $H(B)$
can be computed in $\mathrm{FL}^{\mathrm{SL}}$.

3.7 Bunch Graphs and Interlacing Parity Graphs 

Once the bunch collections are formed, the bunch graphs are constructed as
follows: extend each path $P_i'$ to a path $Q_i$ by introducing a new edge between
$L(P_i')$ and a new vertex $U(P_i')$. Collapse each bunch $B$ of $P_i'$ to a single node $v_B$
(which now has edges to some vertices of $P_i'$); thus $B$ becomes a star $S_B$ with
center $v_B$. Further, if $B$ is an anchor bunch, include edge $(U(P_i'), v_B)$, and if $B$
is a spanning bunch, include edge $(R(P_i'), v_B)$. This gives the so-called bunch
graph $J_i(Q_i)$, which can clearly be constructed in $\mathrm{FL}^{\mathrm{SL}}$.

For each $J_i(Q_i)$, an interlacing parity graph $G_{ij}$ is constructed as follows:
There is a vertex $v_B$ for each star $S_B$, and a vertex for each triple $(u, v, B)$ (a chord)
where $u, v$ are attachment vertices of $S_B$ on $Q_i$, and $u$ is an extreme (leftmost
/ rightmost) attachment. Edges connect (1) a bunch vertex $v_B$ to all its chords
$(u, v, B)$, (2) bunch vertices $v_B, v_{B'}$ which share an internal (non-extreme) attachment
vertex on $Q_i$, and (3) each chord to its left and right chords, when they
exist. The left and right chords are defined as follows: For chord $(u, v, B)$, consider
the set of chords $\{(u', v', B') \mid B' \neq B,\ u' < u < v' < v\}$; intuitively, these
are chords of other bunches that interlace with $B$. The left chord of $(u, v, B)$ is
the chord from this set with minimum $u'$; ties are broken in favor of largest $v'$.
Right chords are analogously defined.

All the information needed to construct $G_{ij}$ can be extracted from $J_i(Q_i)$
by a logspace computation.



3.8 The Constraint Graph G* 

The constraint graph contains two parts. One is the union over all $i$ of the
interlacing parity graphs $G_{ij}$, and thus can be constructed in $\mathrm{FL}^{\mathrm{SL}}$. The other
part accounts for the fact that more than one bunch may belong to the same
bridge, and hence all such bunches must be placed consistently (on the same
side) with respect to a path or fundamental cycle. This part has paths of length
1 or 2, called links, between anchor bunches and related bunches. Determining
for each anchor bunch the length of the link, and its other endpoint, requires
information about $G_{ij}$ and computations described in an earlier subsection, and so the
constraint graph $G^*$ can also be constructed in $\mathrm{FL}^{\mathrm{SL}}$.

If $G^*$ is not 2-colorable, then $G$ is not planar. If $G^*$ is 2-colorable, then the
2-coloring yields a combinatorial embedding of $G_l$. Testing whether $G^*$ is
2-colorable (i.e. bipartite), and obtaining a 2-coloring if one exists, is known to be
in $\mathrm{FL}^{\mathrm{SL}}$; see for instance [2].



3.9 The Combinatorial Embedding of $G_l$ and of $G$

Given an undirected graph, a combinatorial embedding $\phi$ is a cyclic ordering of
the edges around each vertex. Replace each edge $\{u, v\}$ by directed arcs $(u, v)$ and
$(v, u)$ to give the arc set $A$. Then $\phi$ is a permutation on $A$ satisfying
$\phi((u, v)) = (u, w)$ for some $w$; i.e. $\phi$ cyclically permutes the arcs leaving each vertex. Let $R$ be
the permutation mapping each arc to its inverse. The combinatorial embedding
$\phi$ is planar iff the number of cycles $f$ in $\phi^* = \phi \circ R$ satisfies Euler's formula
$n + f = m + 1 + c$. ($n$, $m$, $c$ are the number of vertices, undirected edges,
connected components respectively.)

The 2-coloring of $G^*$ partitions the non-$P_i'$ edges with respect to $P_i'$ in the
obvious way (those that are to be embedded inside, and those that go outside).
To further fix the cyclic ordering within each set, the algorithm of [22] computes,
for each vertex $v$, a set of "tufts", which are the connected components of a graph
that is easy to compute using the operations provided in an earlier subsection. Each tuft
is labeled with a pair of vertices (again, these labels are easy to compute), and
then the tufts are ordered by sorting these labels. (Sorting can be accomplished
in logspace.) The cyclic ordering for tufts is either increasing or decreasing by
labels, determined by the 2-coloring. This cyclic ordering then yields an ordering
$\phi$ for all the arcs in $G_l$ via a simple calculation.

To check planarity of $\phi$, note that $c$ can be computed in SL, and $n$ and $m$
are known, so the only thing left to compute is $f$. This can be computed in L
as follows: Count the number of arcs $a$ for which $a = c(a)$, where $c(a)$ is the
lexicographically smallest arc on the cycle of $\phi^*$ containing $a$.
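
As an illustration of this check, here is a minimal Python sketch that counts the cycles of $\phi^* = \phi \circ R$ with the "smallest arc on its cycle" test and then compares against Euler's formula. The representation (a dictionary sending each vertex to the cyclic list of its neighbours), the function names and the union-find used for $c$ are assumptions of this sketch; it makes no attempt to respect the logspace bound.

```python
def phi_star(rot, arc):
    """phi* = phi o R: reverse the arc, then step to the next arc around its tail."""
    u, v = arc
    ring = rot[v]                      # cyclic order of neighbours around v
    k = ring.index(u)                  # position of the reversed arc (v, u)
    return (v, ring[(k + 1) % len(ring)])


def count_faces(rot):
    """Count arcs that are the lexicographically smallest arc of their phi*-cycle."""
    faces = 0
    for u in rot:
        for v in rot[u]:
            a = (u, v)
            smallest, b = a, phi_star(rot, a)
            while b != a:              # walk the whole cycle through a
                smallest = min(smallest, b)
                b = phi_star(rot, b)
            if smallest == a:
                faces += 1
    return faces


def count_components(rot):
    """Number of connected components, via a small union-find."""
    parent = {v: v for v in rot}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u in rot:
        for v in rot[u]:
            parent[find(u)] = find(v)
    return len({find(v) for v in rot})


def is_planar_embedding(rot):
    """Check Euler's formula n + f = m + 1 + c for the given rotation system."""
    n = len(rot)
    m = sum(len(ring) for ring in rot.values()) // 2
    return n + count_faces(rot) == m + 1 + count_components(rot)


# K4 drawn with vertex 3 inside the triangle 0, 1, 2 (counter-clockwise rotations).
k4_planar = {0: [1, 3, 2], 1: [2, 3, 0], 2: [0, 3, 1], 3: [2, 0, 1]}
# Same graph, but the rotation at vertex 3 is reversed.
k4_twisted = {0: [1, 3, 2], 1: [2, 3, 0], 2: [0, 3, 1], 3: [0, 2, 1]}
triangle = {0: [1, 2], 1: [2, 0], 2: [0, 1]}
```

For `k4_planar` the sketch finds four cycles, so $4 + 4 = 6 + 1 + 1$ holds; reversing a single rotation leaves only two cycles and the test fails.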

Since $G_l$ is obtained from $G$ by local replacements only, an embedding $\phi'$ of
$G$ can be easily extracted from the embedding $\phi$ of $G_l$: just collapse the vertices of
$T_v$ back into $v$.



3.10 Merging Embeddings of 2-Components 

It is well-known that a graph is planar iff its biconnected components are planar; 
see for instance [zg. To constructively obtain a planar combinatorial embedding 
of G from planar combinatorial embeddings of its 2-components, note that the 
ordering of edges around each vertex which is not a cut-vertex is fixed within 
the corresponding 2-component. At cut-vertices, adopt the following strategy: 
Let $w$ be a cut-vertex present in 2-components $(u_1, v_1), (u_2, v_2), \ldots, (u_d, v_d)$. The
edges of $w$ in each of these components are ordered according to $\phi_1, \ldots, \phi_d$. Let $x_i$
be the smallest neighbor of $w$ in the 2-component $(u_i, v_i)$. The orderings can be
pasted together in $\mathrm{FL}^{\mathrm{SL}}$ as follows:

$\phi(w, z) = \phi_j(w, z)$ if $z$ is in the 2-component $(u_j, v_j)$ and $z \neq x_j$,

$\phi(w, x_j) = \phi_{j+1}(w, x_{j+1})$ for $1 \le j < d$,

$\phi(w, x_d) = \phi_1(w, x_1)$.
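
The pasting step at a cut-vertex can be sketched in Python as follows. This is only an illustration under one reading of the rule: every neighbour other than the smallest neighbour $x_j$ of its component keeps its successor, and $x_j$ is followed by the successor of $x_{j+1}$ in the next component (cyclically). The list-of-rings input and the function names are assumptions of the sketch.

```python
def paste_orderings(rings):
    """Paste the cyclic neighbour orders around one cut-vertex w into a single cycle.

    rings[j] lists the neighbours of w inside the j-th 2-component, in the
    cyclic order given by that component's embedding. Returns a dict mapping
    each neighbour z to the neighbour that follows it around w.
    """
    # Successor function of each component, restricted to the arcs leaving w.
    succ_in = [{z: ring[(k + 1) % len(ring)] for k, z in enumerate(ring)}
               for ring in rings]
    smallest = [min(ring) for ring in rings]          # the vertices x_j

    d = len(rings)
    succ = {}
    for j, ring in enumerate(rings):
        nxt = (j + 1) % d
        for z in ring:
            if z != smallest[j]:
                succ[z] = succ_in[j][z]                 # stay inside component j
            else:
                succ[z] = succ_in[nxt][smallest[nxt]]   # jump into the next component
    return succ


def cycle_from(succ, start):
    """List the neighbours met when following succ from start until it returns."""
    seen, z = [start], succ[start]
    while z != start:
        seen.append(z)
        z = succ[z]
    return seen
```

With `rings = [[1, 2, 3], [4, 5]]` the result is the single cycle 1, 5, 4, 2, 3, in which the neighbours 1, 2, 3 still appear in their original cyclic order.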

4 Open Problems 

— Is planarity testing hard for SL? Is it in L? Until these classes are proved to 
coincide, there still remains some room for improvement in the bounds we 
present in this paper. 

— Can any of the techniques used here be extended to construct embeddings of 
small genus graphs? For instance, what is the parallel complexity of checking 
if a graph has genus 1, and if so, constructing a toroidal embedding?

Abstract. We address the characterization of finite test-sets for cube-freeness
of morphisms between free monoids, that is, the finite sets $T$
such that a morphism $f$ is cube-free if and only if $f(T)$ is cube-free. We
first prove that such a finite test-set does not exist for morphisms defined
on an alphabet containing at least three letters. Then we prove that for
binary morphisms, a set $T$ of cube-free words is a test-set if and only
if it contains twelve particular factors. Consequently, a morphism $f$ on
$\{a, b\}$ is cube-free if and only if $f(aabbababbabbaabaababaabb)$ is cube-free
(length 24 is optimal). Another consequence is an unpublished result of
Leconte: A binary morphism is cube-free if and only if the images of all
cube-free words of length 7 are cube-free.

We also prove that, given an alphabet A containing at least two letters, 
the monoid of cube-free endomorphisms on A is not finitely generated. 



1 Introduction 

At the beginning of the century, Thue [19, 20] (see also Pi) worked on repetitions
in words. In particular, he showed the existence of a square-free infinite
word over a three-letter alphabet, and the existence of an overlap-free (and thus
cube-free) infinite word over a binary alphabet. Since these works, many other
results on repetitions in words have been achieved (see 0 for a recent survey,
and m for related works), and Thue's results have been rediscovered in several
instances (see for example m).

Thue obtained an infinite overlap-free word over a two-letter alphabet (called
Thue-Morse word since the works of Morse US!) by iterating a morphism $\mu$
($\mu(a) = ab$ and $\mu(b) = ba$). Morphisms are widely used to generate infinite
words. To obtain an infinite word with some property $P$, one very often uses
$P$-preserving morphisms, called $P$-morphisms. Naturally, some studies concern
such morphisms: Sturmian morphisms (see [14] for a recent survey), power-free
morphisms H2!, square-free morphisms PS|, ...
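
To make the iteration concrete, the following Python sketch applies the Thue-Morse morphism repeatedly and checks a prefix for overlaps. The dictionary encoding of a morphism and the helper names are choices made for this illustration only.

```python
def apply_morphism(f, word):
    """Image of word under the morphism given letter by letter in the dict f."""
    return "".join(f[letter] for letter in word)


def iterate(f, word, times):
    """Apply the morphism f to word the given number of times."""
    for _ in range(times):
        word = apply_morphism(f, word)
    return word


def has_overlap(word):
    """True if word has a factor x u x u x with x a letter (an overlap)."""
    n = len(word)
    for p in range(1, n // 2 + 1):           # p = |xu|, the period
        for i in range(n - 2 * p):           # factor word[i : i + 2p + 1]
            if all(word[k] == word[k + p] for k in range(i, i + p + 1)):
                return True
    return False


mu = {"a": "ab", "b": "ba"}
prefix = iterate(mu, "a", 6)   # the first 64 letters of the Thue-Morse word
```

Three iterations starting from `a` give `abbabaab`, and no overlap shows up in the 64-letter prefix, whereas a word such as `ababa` is rejected.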

Our paper is concerned with cube-free morphisms. The closely related problem of
infinite cube-free words generated by morphisms has already been studied, for
instance in [DU bj. Necessary conditions or sufficient conditions for cube-freeness
of a morphism can be found in the studies of the general case of $k$-th power-free
morphisms mm. But characterizations of cube-freeness exist only for
morphisms from a two-letter or a three-letter alphabet El. In the case of a two-letter
alphabet, an unpublished result of Leconte is: A morphism on a binary
alphabet is cube-free if and only if the images of all cube-free words of length at 
most seven are cube-free. 

In this paper, we consider test-sets for cube-freeness of morphisms on an
alphabet $A$, that is, subsets $T$ of $A^*$ such that given any morphism $f$ defined on
$A$, $f$ is cube-free if and only if $f(T)$ is cube-free. The result of Leconte can be
rephrased: The set of cube-free words over $\{a, b\}$ of length at most 7 is a test-set
for morphisms on $\{a, b\}$. Similar results have been obtained for other property-free
morphisms. Among them, let us mention that on a three-letter alphabet,
a morphism is square-free if and only if the images of all square-free words of
length at most 5 are square-free (Crochemore |S]). In |^, Berstel and Seebold 
show that an endomorphism $f$ on $\{a, b\}$ is overlap-free if and only if the images
of all overlap-free words of length at most 3 are overlap-free or, equivalently, if
$f(abbabaab)$ is overlap-free. In [17], Richomme and Seebold improve this result,
showing that an endomorphism $f$ on $\{a, b\}$ is overlap-free if and only if $f(bbabaa)$
is overlap-free. More precisely, they characterize all the finite test-sets for overlap-freeness
of binary endomorphisms, that is, each set $S$ such that a morphism $f$
is overlap-free if and only if given any word $w$ in $S$, $f(w)$ is overlap-free.

In Section 4 we give our main result, which is a characterization of test-sets
for cube-free morphisms on a two-letter alphabet: The set of factors of such a
test-set just has to contain a particular finite subset. As one of the consequences
of this characterization, we show that a morphism $f$ defined on $\{a, b\}$ is cube-free
if and only if the word $f(aabbababbabbaabaababaabb)$ is cube-free. Length 24 of
this test-word is optimal: No word of length 23 or less can be used to test the
cube-freeness of a morphism on a binary alphabet.

In Section 4 we also show that any test-set for cube-free morphisms defined
on an alphabet containing at least three letters is infinite. Thus, test-sets give 
no general effective way to determine whether a morphism is cube-free. 

The set of cube-free morphisms forms a monoid. When a monoid of morphisms
is finitely generated, we have a natural way to determine if a morphism
belongs to this monoid. Such a situation is known for instance for overlap-free
morphisms. In Section 3 we show that this is no longer true for cube-free
morphisms. The monoid of cube-free endomorphisms on an alphabet $A$ containing
at least two letters is never finitely generated.

2 Preliminaries 

In this section, we recall and introduce some basic notions on words and mor- 
phisms. 

Let $A$ be an alphabet, that is, a finite non-empty set of abstract symbols
called letters. The cardinal of $A$, i.e., the number of elements of $A$, is denoted
by Card($A$). A word over $A$ is a finite sequence of letters from $A$. The set of the
words over $A$ equipped with the concatenation of words and completed with a
neutral element $\varepsilon$ called the empty word is a free monoid denoted by $A^*$.

Let $u = a_1 a_2 \ldots a_n$ be a word over $A$, with $a_i \in A$ ($1 \le i \le n$). The number
$n$ of letters of $u$ is called its length and is denoted by $|u|$. Observe $|\varepsilon| = 0$. When
$n \ge 1$, the mirror image of $u$, denoted by $\tilde{u}$, is the word $\tilde{u} = a_n \ldots a_2 a_1$. In the
particular case of the empty word, $\tilde{\varepsilon} = \varepsilon$. Given a set of words $X$, we denote by
$\tilde{X}$ the set of all the words $\tilde{w}$ with $w$ in $X$.

A word $u$ is a factor of a word $v$ if $v = v_1 u v_2$ for some words $v_1, v_2$. If $v_1 = \varepsilon$,
$u$ is a prefix of $v$. If $v_2 = \varepsilon$, $u$ is a suffix of $v$. Given a set of words $X$, Fact($X$)
denotes the set of all the factors of the words in $X$.

Let us consider a non-empty word $w$ and its letter-decomposition $x_1 \ldots x_n$.
For any integers $i, j$, $1 \le i \le j \le n$, we denote by $w[i..j]$ the factor $x_i \ldots x_j$ of
$w$. We extend this notation when $i > j$: In this case, $w[i..j] = \varepsilon$. We abbreviate
$w[i..i]$ in $w[i]$. This notation denotes the $i$-th letter of $w$.

For an integer $n \ge 2$, we denote by $u^n$ the concatenation of $n$ occurrences
of the word $u$; moreover $u^0 = \varepsilon$ and $u^1 = u$. In particular, a cube is a word of the form
$u^3$ with $u \neq \varepsilon$. A word $w$ contains a cube if at least one of its factors is a cube.
A word is called cube-free if it does not contain any cube as a factor. A set of
cube-free words is said to be cube-free.
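
The definition translates directly into a brute-force test. The sketch below is only meant to make the notion concrete (function names are ours, and no attempt is made at efficiency): it tries every period $|u|$ and every starting position.

```python
def find_cube(word):
    """Return (start, period) of a cube u^3 occurring in word, or None."""
    n = len(word)
    for p in range(1, n // 3 + 1):              # p = |u|
        for i in range(n - 3 * p + 1):          # candidate factor word[i : i + 3p]
            if word[i:i + p] == word[i + p:i + 2 * p] == word[i + 2 * p:i + 3 * p]:
                return i, p
    return None


def is_cube_free(word):
    """True if no factor of word is a cube."""
    return find_cube(word) is None
```

For instance `aabaab` is a square but contains no cube, while `abababa` contains the cube `(ab)^3` at position 0.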

A morphism $f$ from an alphabet $A$ to another alphabet $B$ is a mapping from
$A^*$ to $B^*$ such that given any words $u$ and $v$ over $A$, we have $f(uv) = f(u)f(v)$.
When $B = A$, $f$ is called an endomorphism on $A$. When $B$ has no importance,
we say that $f$ is defined on $A$ or that $f$ is a morphism on $A$ (this does not
mean that $f$ is an endomorphism). Observe that for a morphism $f$ on $A$, we
necessarily get $f(\varepsilon) = \varepsilon$, and $f$ is uniquely defined by the values of $f(x)$ for all $x$
in $A$. The Identity endomorphism (resp. the Empty morphism) on $A$, denoted
$Id$ (resp. $\epsilon$), is defined by $Id(x) = x$ (resp. $\epsilon(x) = \varepsilon$), for all $x$ in $A$. When $A$ is
a binary alphabet $\{a, b\}$, the Exchange endomorphism $E$ is defined by $E(a) = b$
and $E(b) = a$. If $X$ is a set of words, $f(X)$ denotes the set of all the images of
the words in $X$.

A morphism $f$ on $A$ is called cube-free if for every cube-free word $w$ over
$A$, $f(w)$ is cube-free. The morphisms $Id$, $\epsilon$ and (on a binary alphabet) $E$ are
obviously cube-free. Let us remark that if two composable morphisms $f$ and $g$ are
cube-free then $f \circ g$ is cube-free (where $\circ$ denotes the composition of functions).
Thus the set of cube-free endomorphisms on a given alphabet is a monoid. It is
also easy to verify that a morphism $f$ on $\{a, b\}$ is cube-free if and only if $f \circ E$
is cube-free. Given a morphism $f$ on $A$, the mirror morphism $\tilde{f}$ of $f$ is defined
for all words $w$ over $A$ by $\tilde{f}(w) = \widetilde{f(\tilde{w})}$. Since $\tilde{\tilde{f}} = f$, $f$ is cube-free if and
only if $\tilde{f}$ is cube-free. One can see that for a non-empty cube-free morphism $f$,
given two distinct letters $x$ and $y$, we cannot have $f(x)$ prefix nor suffix of $f(y)$ (such a
morphism $f$ is called biprefix), else $f(xxyx)$ or $f(xyxx)$ contains the cube $(f(x))^3$.
In particular, for any cube-free morphism different from the empty morphism,
$f(x) \neq \varepsilon$ for each letter $x$. We will use the following theorem:



Theorem 1. Given two alphabets $A$ and $B$ with Card($A$) $\ge 2$, Card($B$) $\ge 2$,
there exist some cube-free morphisms from $A$ to $B$.

If the image of some word by a morphism contains a cube, we often want to
consider the exact factor whose image contains this cube. Given a morphism $f$,
we say that the image of a non-empty word $w$ is a minimal cover of a cube $u^3$
if $f(w[1])$ (resp. $f(w[|w|])$) has a prefix $p \neq f(w[1])$ (resp. a suffix $s \neq f(w[|w|])$)
such that $f(w) = puuus$. The image of a word contains a cube if and only if the
image of one of its factors is a minimal cover of this cube.
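
The notion of minimal cover can be tested mechanically by trying every admissible pair $(p, s)$. In the Python sketch below the function names are ours, and the morphism `f` is a hypothetical example (chosen so that the image of `a` is a prefix of the image of `b`); it is not taken from the paper.

```python
def apply_morphism(f, word):
    """Image of word under the morphism given letter by letter in the dict f."""
    return "".join(f[x] for x in word)


def is_cube(word):
    """True if word = u^3 for some non-empty u."""
    n = len(word)
    if n == 0 or n % 3:
        return False
    p = n // 3
    return word[:p] == word[p:2 * p] == word[2 * p:]


def minimal_cover(f, w):
    """Return (p, u, s) with f(w) = p u u u s as in the definition, or None.

    p ranges over the prefixes of the image of the first letter other than
    that image itself, s over the suffixes of the image of the last letter
    other than that image itself.
    """
    if not w:
        return None
    image = apply_morphism(f, w)
    first, last = f[w[0]], f[w[-1]]
    for lp in range(len(first)):            # |p| < |f(first letter)|
        for ls in range(len(last)):         # |s| < |f(last letter)|
            middle = image[lp:len(image) - ls]
            if is_cube(middle):
                u = middle[:len(middle) // 3]
                return image[:lp], u, image[len(image) - ls:]
    return None


f = {"a": "ab", "b": "aba"}   # hypothetical morphism, not cube-free
```

Here the image of `aab` is `abababa`, a minimal cover of `(ab)^3` with `p` empty and `s = a`. The image of the longer word `aaba` still contains that cube but is not a minimal cover of any cube, which illustrates why one restricts attention to the exact factor.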

3 About Monoids of Cube-Free Endomorphisms 

The set of cube-free endomorphisms on a given alphabet is a monoid. The aim 
of this section is to prove the following result: 

Theorem 2. The monoid of cube-free endomorphisms on an alphabet $A$ containing
at least two letters is not finitely generated.

Given an alphabet $A$, we denote by $CF_A$ the monoid of cube-free endomorphisms
on $A$ different from the empty morphism $\epsilon$. A set of generators of $CF_A$ is
a subset $G$ of $CF_A$ such that any morphism $f$ in $CF_A$ has a finite decomposition
over $G$. Theorem 2 says that $CF_A$ has no finite set of generators. In order to
prove it, we first observe:

Lemma 1. Given a morphism $f : \{a, b\}^* \to \{a, b\}^*$ such that $|f(a)| = 1$, $f$ is
cube-free if and only if $f = Id$ or $f = E$.

Proof. The identity and exchange morphisms are cube-free. Let $f$ be a cube-free
morphism with $|f(a)| = 1$, and assume $f$ is different from the identity and the
exchange morphisms. Since a morphism $f$ is cube-free if and only if $E \circ f$ is
cube-free, without loss of generality, we can assume $f(a) = a$. We have $f(b) \neq \varepsilon$
since otherwise $f(aaba) = aaa$. Since $f \neq Id$, $f(b) \neq b$. Observe that $f(b)$ can
neither start nor end with $a$. Otherwise, $f(aab)$ or $f(baa)$ respectively contains $aaa$
as prefix or suffix. Let $f(b) = bub$. The word $bu$ cannot end with $b$, otherwise $f(bb)$
contains $bbb$. In the same way, $ub$ does not start with $b$. The word $bub$ does not
start with $bab$, otherwise $f(bab)$ contains $ababab$. In the same way $bub$ does not
end with $bab$. Thus $bub$ starts and ends with $baab$. This is a final contradiction
since in this case $f(baab)$ contains the cube $baabaabaa$. □
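
The case analysis of this proof can be replayed by brute force on short images. The following Python sketch (an illustration only; the witness list and function names are ours) fixes the image of `a` to be `a`, enumerates all candidate images of `b` up to a length bound, and keeps those for which the images of the cube-free words used in the proof remain cube-free.

```python
from itertools import product


def is_cube_free(word):
    """True if no factor of word is a cube u^3 with u non-empty."""
    n = len(word)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if word[i:i + p] == word[i + p:i + 2 * p] == word[i + 2 * p:i + 3 * p]:
                return False
    return True


def apply_morphism(f, word):
    """Image of word under the morphism given letter by letter in the dict f."""
    return "".join(f[x] for x in word)


# Cube-free words used in the proof (plus b itself, to catch cubes inside f(b)).
WITNESSES = ["b", "aaba", "aab", "baa", "bb", "bab", "baab"]


def survivors(max_len):
    """Images of b, of length at most max_len, that pass every witness when a maps to a."""
    kept = []
    for length in range(max_len + 1):
        for letters in product("ab", repeat=length):
            f = {"a": "a", "b": "".join(letters)}
            if all(is_cube_free(apply_morphism(f, w)) for w in WITNESSES):
                kept.append(f["b"])
    return kept
```

Calling `survivors(8)` leaves only the image `b`, that is, the identity, in line with the lemma once the symmetric case handled through $E$ is taken into account.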

We now define a particular family of cube-free morphisms. Considering a
word $V$ over $\{b, c\}$ such that $cVcb$ is cube-free, we define the morphism $f_V$ by

    $f_V(a) = a$,
    $f_V(b) = bacVc$.

One can verify that

Lemma 2. The morphism $f_V$ is cube-free.

We are now able to sketch the proof of Theorem 2.

First, we consider the case Card($A$) $= 2$. Let $A = \{a, b\}$ and assume $CF_{\{a,b\}}$
is finitely generated. Let $G$ be a finite set of generators of this monoid. From
Lemma 1, any morphism $f$ in $G$ different from $E$ and $Id$ verifies $|f(a)| \ge 2$ and
$|f(b)| \ge 2$. Any cube-free morphism different from $E$ and $Id$ can be decomposed
over $G$ as $f = g_1 \circ g_2 \circ \ldots \circ g_n$ with $g_i \in G \setminus \{Id\}$ in such a way that $g_i = E$ implies
$g_{i+1} \neq E$. Thus, if $n \ge 6$, $|f(a)| \ge 8$. Consequently, since $G$ is finite, there is a
finite number of cube-free morphisms with $|f(a)| = 6$ (each of them has a decomposition with $n \le 5$).

Since there exist infinite cube-free words over a two-letter alphabet $\{b, c\}$
(see, for instance, [19, 20]), and since in such a word, the factor $cb$ occurs infinitely
often, there exist arbitrarily large words $V$ over $\{b, c\}$ such that $cVcb$ is a cube-free
word. Thus, using Lemma 2, we have a contradiction. Indeed, there exist an
infinity of cube-free morphisms (from $\{a, b\}^*$ into itself) $g \circ f_V$ with $|g \circ f_V(a)| = 6$,
where $g$ is the cube-free morphism (see |E|)

    $g(a) = aababb$,
    $g(b) = aabbab$,
    $g(c) = abbaab$.
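
The counting argument can be made tangible with a small Python sketch. It cuts suitable words $V$ out of a prefix of the Thue-Morse word written over the letters b and c (an assumption of this sketch: any other infinite cube-free word would do) and builds the compositions of $g$ with the corresponding morphisms. It does not re-prove that these compositions are cube-free; it only shows that they are pairwise distinct while the image of `a` always has length 6.

```python
def apply_morphism(f, word):
    """Image of word under the morphism given letter by letter in the dict f."""
    return "".join(f[x] for x in word)


def thue_morse(letters, steps):
    """Prefix of length 2**steps of the Thue-Morse word over the two given letters."""
    x, y = letters
    mu = {x: x + y, y: y + x}
    word = x
    for _ in range(steps):
        word = apply_morphism(mu, word)
    return word


def candidate_words(t):
    """Words V such that c V c b is a factor of t (hence cube-free when t is)."""
    start = t.index("c")
    found = []
    for end in range(start + 1, len(t) - 1):
        if t[end:end + 2] == "cb":
            found.append(t[start + 1:end])      # the factor is c + V + cb
    return found


G = {"a": "aababb", "b": "aabbab", "c": "abbaab"}   # the morphism g of the text


def composed(V):
    """The composition of g with the morphism sending a to a and b to bacVc."""
    f_V = {"a": "a", "b": "bac" + V + "c"}
    return {x: apply_morphism(G, f_V[x]) for x in "ab"}


t = thue_morse("bc", 6)
words = candidate_words(t)
images = [composed(V) for V in words]
```

Every entry of `images` sends `a` to `aababb`, while the images of `b` grow with $|V|$, so a longer prefix `t` yields as many distinct morphisms as desired.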

Let us now consider the case Card($A$) $\ge 3$. Let $a$ and $b$ be two particular letters
of $A$. We consider the submonoid $S$ of $CF_A$ of the cube-free endomorphisms
$f$ on $A$ such that $f(a), f(b)$ are in $\{a, b\}^*$ and for all letters $x$ in $A \setminus \{a, b\}$,
$f(x) = x$. Any morphism $f$ of $CF_{\{a,b\}}$ can be extended into a morphism of
$S$ taking $f(x) = x$ for $x$ in $A \setminus \{a, b\}$. Conversely, any morphism of $S$ can be
projected on a morphism of $CF_{\{a,b\}}$. Consequently, one can see that $S$ finitely
generated implies $CF_{\{a,b\}}$ finitely generated. Thus $S$ is not finitely generated.

We give now the main ideas to prove that if $CF_A$ is finitely generated then
$S$ is also finitely generated. This ends the proof of Theorem 2.

Let us recall that a permutation of the alphabet $A$ is a bijective endomorphism
$p$ on $A$ such that for all $x$ in $A$, $|p(x)| = 1$. We denote by $p^{-1}$ the inverse
permutation (such that $p \circ p^{-1} = p^{-1} \circ p = Id$). Observe that a permutation is
a cube-free morphism. We can prove:

Fact. Given a morphism $f$ in $S$, and two morphisms $g$ and $h$ in $CF_A$,
if $f = g \circ h$, then there exists a permutation $p$ of $A$ such that $g \circ p$ and
$p^{-1} \circ h$ belong to $S$.

Now assume that $G$ is a finite set of generators of $CF_A$. Given a morphism
$f$ in $S$, since $S \subseteq CF_A$, there exist some morphisms $g_1, \ldots, g_n$ in $G$ such that
$f = g_1 \circ g_2 \circ \ldots \circ g_n$. From the previous fact, there exist some permutations
$p_1, p_2, \ldots, p_n, p_{n+1}$ with $p_1 = p_{n+1} = Id$ such that for all $i$ ($1 \le i \le n$),
$p_i^{-1} \circ g_i \circ p_{i+1} \in S$. We have:

    $f = (p_1^{-1} \circ g_1 \circ p_2) \circ (p_2^{-1} \circ g_2 \circ p_3) \circ \ldots \circ (p_n^{-1} \circ g_n \circ p_{n+1})$.

Since there is a finite number of permutations of $A$, $S$ is finitely generated by
the morphisms of $S$ of the form $p \circ g \circ q$ with $p, q$ two permutations of $A$ and
$g \in G$. □

4 Test-Sets

In this section, we are interested in test-sets for cube-free morphisms. Let $A$, $B$
be two alphabets. If Card($B$) $= 1$, the only cube-free morphism from $A$ to $B$
is the empty morphism $\epsilon$ (and $Id$ if Card($A$) $= 1$). In the rest of the paper, we
assume that Card($B$) $\ge 2$.

A set of words $T$ is a test-set for cube-free morphisms from $A$ to $B$ if, given
any morphism $f$ from $A$ to $B$, $f$ is cube-free if and only if $f(T)$ is cube-free.

We first examine the case Card($A$) $\ge 3$.

Theorem 3. Given two alphabets $A$ and $B$ with Card($A$) $\ge 3$ and Card($B$) $\ge 2$,
there is no finite test-set for cube-free morphisms from $A$ to $B$.

Theorem 3 is a corollary of Proposition 1 below. Let $A$ be an alphabet containing
at least two letters. We consider one particular letter $a$ in $A$, and three
different letters $x, y, z$ which do not belong to $A \setminus \{a\}$. Let $C = (A \setminus \{a\}) \cup \{x, y, z\}$,
and let $u$ and $v$ be two cube-free words over $A \setminus \{a\}$, not simultaneously empty.
We define the morphism $f_{u,v} : A^* \to C^*$ by:

    $f_{u,v}(a) = xzyuxyvxzyuxyvxzy$,
    $f_{u,v}(b) = b$ for all $b$ in $A \setminus \{a\}$.

We have (see Section 2 for the definition of cover):

Proposition 1. Let $w$ be a cube-free word over $A$. $f_{u,v}(w)$ is a minimal cover
of a cube if and only if $w = avaua$.

For lack of space, we leave it to the reader to verify this proposition.
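
One direction of the proposition is easy to observe experimentally. The Python sketch below (names are ours; the alphabet is taken to be {a, b} as an assumption, so that $u$ and $v$ are powers of b) builds the morphism and exhibits the cube $(y\,v\,xzy\,u\,x)^3$ inside the image of $avaua$. It does not check the converse direction.

```python
def find_cube(word):
    """Return some u such that u^3 is a factor of word, or None."""
    n = len(word)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if word[i:i + p] == word[i + p:i + 2 * p] == word[i + 2 * p:i + 3 * p]:
                return word[i:i + p]
    return None


def f_uv(u, v, alphabet="ab"):
    """Morphism sending a to xzy u xy v xzy u xy v xzy and fixing the other letters."""
    f = {letter: letter for letter in alphabet}
    f["a"] = "xzy" + u + "xy" + v + "xzy" + u + "xy" + v + "xzy"
    return f


def image(f, word):
    """Image of word under the morphism given letter by letter in the dict f."""
    return "".join(f[x] for x in word)


# The six pairs (u, v) over {b} for which avaua is cube-free.
PAIRS = [("bb", "bb"), ("b", "bb"), ("bb", "b"), ("", "bb"), ("bb", ""), ("b", "b")]
```

For $u = b$ and $v = bb$, the word $avaua$ is `abbaba` and its image contains `ybbxzybx` three times in a row, straddling the three images of `a`.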

Proof of Theorem 3. Consider a particular letter $a$ in $A$. Let $C = (A \setminus \{a\}) \cup \{x, y, z\}$,
where $x, y, z$ are three different letters which do not belong to $A \setminus \{a\}$.
Since Card($A$) $\ge 3$ and Card($B$) $\ge 2$, using Theorem 1 there exists a cube-free
morphism $g$ from $C$ to $B$. From Proposition 1, given two cube-free words $u$ and $v$ over
$A \setminus \{a\}$, not simultaneously empty, we know that $g \circ f_{u,v}(w)$ is a minimal cover
of a cube if and only if $w = avaua$. Thus $avaua$ belongs to the set of factors of
any test-set for cube-free morphisms from $A$ to $B$. Since there is an infinity of
cube-free words over $A \setminus \{a\}$, any test-set for cube-free morphisms from $A$ to $B$
is infinite. □

From now on, we will assume Card($A$) $= 2$, for instance $A = \{a, b\}$. Cube-free
words $u$ and $v$ over $\{b\}$ can take only three values: $\varepsilon$, $b$ and $bb$. So, using
similar techniques as in the proof of Theorem 3 and considering $f_{u,v}$ and $f_{u,v} \circ E$
for $(u, v)$ in $\{(bb, bb), (b, bb), (bb, b), (\varepsilon, bb), (bb, \varepsilon), (b, b)\}$, we get, as another
consequence of Proposition 1, that for any test-set $T$ for cube-free morphisms
from $\{a, b\}$ to an alphabet of cardinal at least 2, the set of factors of $T$ contains:

    $T_{min}$ = { abbabba, baabaab, ababba, babaab, abbaba, baabab,
                aabba, bbaab, abbaa, baabb, ababa, babab }.

The converse also holds and gives a characterization of test-sets for cube-free
morphisms on a two-letter alphabet:

Theorem 4. A subset $T$ of $\{a, b\}^*$ is a test-set for cube-free morphisms from
$\{a, b\}$ to an alphabet of cardinal at least two if and only if $T$ is cube-free and
$T_{min} \subseteq \mathrm{Fact}(T)$.

Before giving some ideas on the proof of this theorem, let us examine some
particular test-sets. Obviously, the set $T_{min}$ is one of them. Since it contains
twelve elements, one can ask for a test-set of minimal cardinality. There exist
some test-words (that is, test-sets of cardinal 1). For instance, one can verify that
the cube-free word $aabbababbabbaabaababaabb$ is one of the 56 words of length
24 that fulfill the conditions of Theorem 4 and thus is a test-set for cube-free
morphisms on $\{a, b\}$. The length of this word is optimal: No cube-free word of
length 23 contains all the words of $T_{min}$ as factors.
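
The two conditions on the test-word (being cube-free and having every word of the twelve-element set among its factors) are easy to check by machine. The Python sketch below does so; the generator `cube_free_words` is an extra helper of ours that could be used to re-examine the count of 56 and the optimality claim, which the sketch itself does not assert.

```python
T_MIN = ["abbabba", "baabaab", "ababba", "babaab", "abbaba", "baabab",
         "aabba", "bbaab", "abbaa", "baabb", "ababa", "babab"]

TEST_WORD = "aabbababbabbaabaababaabb"


def has_cube_suffix(word):
    """True if some suffix of word is a cube u^3 with u non-empty."""
    n = len(word)
    return any(word[n - 3 * p:n - 2 * p] == word[n - 2 * p:n - p] == word[n - p:]
               for p in range(1, n // 3 + 1))


def is_cube_free(word):
    """A word is cube-free iff none of its prefixes ends with a cube."""
    return not any(has_cube_suffix(word[:k]) for k in range(1, len(word) + 1))


def missing_factors(word, required=T_MIN):
    """Words of required that are not factors of word."""
    return [t for t in required if t not in word]


def cube_free_words(length, prefix=""):
    """Yield every cube-free word over {a, b} of the given length."""
    if len(prefix) == length:
        yield prefix
        return
    for letter in "ab":
        if not has_cube_suffix(prefix + letter):
            yield from cube_free_words(length, prefix + letter)
```

A possible use is `[w for w in cube_free_words(24) if not missing_factors(w)]`, which enumerates the test-words of length 24; replacing 24 by 23 is one way to revisit the optimality statement.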

Another direct corollary of Theorem 4 is the following unpublished result of
Leconte dU:

Corollary 1. Given a morphism f on a binary alphabet, the following assertions
are equivalent:

1. f is cube-free.

2. The images of all cube-free words of length 7 are cube-free. 

3. The images of all cube-free words of length at most 7 are cube-free. 

About the works of Leconte in, let us also mention that he used the morphism
$f_{bb,bb}$ to show the optimality of the bound 7 in Corollary 1. To prove this
corollary from Theorem 4, one just has to observe that each word of $T_{min}$ is a
factor of a cube-free word of length 7.
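
This last observation can be confirmed by enumeration. The Python sketch below (helper names are ours) lists the cube-free binary words of length 7 and looks, for each of the twelve words, for one that contains it.

```python
from itertools import product

T_MIN = ["abbabba", "baabaab", "ababba", "babaab", "abbaba", "baabab",
         "aabba", "bbaab", "abbaa", "baabb", "ababa", "babab"]


def is_cube_free(word):
    """True if no factor of word is a cube u^3 with u non-empty."""
    n = len(word)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if word[i:i + p] == word[i + p:i + 2 * p] == word[i + 2 * p:i + 3 * p]:
                return False
    return True


# All cube-free words of length exactly 7 over {a, b}.
LENGTH_7 = ["".join(w) for w in product("ab", repeat=7) if is_cube_free("".join(w))]


def host(t):
    """A cube-free word of length 7 having t as a factor, or None if there is none."""
    return next((w for w in LENGTH_7 if t in w), None)
```

Every word of the list has a host, so the set of cube-free words of length 7 has all twelve words among its factors.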

Since the Identity morphism is cube-free, any test-set for cube-free morphisms
is necessarily cube-free. So to prove Theorem 4 it is enough to prove that if
$T_{min} \subseteq \mathrm{Fact}(T)$ for a set $T$ of cube-free words, then $T$ is a test-set for cube-free
morphisms. Denoting by $G_8$ (resp. $L_7$) the set of all cube-free words over $\{a, b\}$
of length at least 8 (resp. at most 7), we cut the proof of Theorem 4 into two
parts (in the two following propositions $f$ is a morphism defined on $\{a, b\}$):

Proposition 2. Given any word $w$ of $G_8$, if $f(T_{min})$ is cube-free then $f(w)$ is
not a minimal cover of a cube.

Proposition 3. If $f(T_{min})$ is cube-free then $f(L_7)$ is cube-free.

Proposition 2 means that if $f(T_{min})$ is cube-free, and if, for a cube-free word
$w$, $f(w)$ contains a cube, then the image of a factor of $w$ which is in $L_7 \setminus \mathrm{Fact}(T_{min})$
is necessarily a minimal cover of this cube. Proposition 3 then proves that such a situation
is impossible. Since every cube-free word over $\{a, b\}$ belongs to $L_7 \cup G_8$, from these
two propositions, $f(T_{min})$ cube-free implies $f$ cube-free. Thus if $T$ is cube-free and
$T_{min} \subseteq \mathrm{Fact}(T)$, $T$ is a test-set for cube-free morphisms. This ends the proof of Theorem 4.

In the next two subsections, we give the main ideas of the proofs of Propositions
2 and 3.

4.1 Ideas of the Proof of Proposition 2

Here, we only give the scheme of the proof (rather technical) of Proposition 2.
By contradiction, we prove that if the image of a word of $G_8$ is a minimal cover
of a cube then at least one word of $f(T_{min})$ contains a cube. For this, we need
to study more precisely the decomposition of a cube when $f(w)$ is a minimal
cover of it, for some morphism $f$ on $\{a, b\}$ and some cube-free word $w$.

Note that up to the end of the section, we denote by n the length of w. 

By definition, the image of a word $w$ is a minimal cover of a cube $uuu$ if and
only if $f(w) = p_1 uuu s_n$ with $f(w[1]) = p_1 s_1$, $f(w[n]) = p_n s_n$, for some words $p_1$,
$s_1 \neq \varepsilon$, $p_n \neq \varepsilon$ and $s_n$. In this case there exist two integers $i$ and $j$ between 1
and $n$ such that $|f(w[1..i-1])| < |p_1 u| \le |f(w[1..i])|$ and
$|f(w[1..j-1])| < |p_1 uu| \le |f(w[1..j])|$ (remember $w[1..0] = \varepsilon$).

In the general case, we may have $i = 1$, $i = j$ or $j = n$. In the proof of
Proposition 2 we will see that when the image of a cube-free word of length
at least eight is a minimal cover of a cube, then we necessarily have $1 < i < j < n$.
In this case, there exist some words $p_i, p_j, s_i, s_j$ such that $f(w[i]) = p_i s_i$,
$f(w[j]) = p_j s_j$, and

    $u = s_1 f(w[2..i-1]) p_i = s_i f(w[i+1..j-1]) p_j = s_j f(w[j+1..n-1]) p_n$.

Since $|f(w[1..i-1])| < |p_1 u|$ and $|f(w[1..j-1])| < |p_1 uu|$, one can observe that $p_i$
and $p_j$ are non-empty words. But we may have $w[2..i-1] = \varepsilon$, $w[i+1..j-1] = \varepsilon$ or
$w[j+1..n-1] = \varepsilon$, i.e., $i = 2$, $j = i + 1$ or $n = j + 1$. The previous situation can be
summed up by Figure 1.



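[Figure: $f(w)$ drawn as the concatenation of $f(w[1])$, $f(w[2..i-1])$, $f(w[i])$, $f(w[i+1..j-1])$, $f(w[j])$, $f(w[j+1..n-1])$ and $f(w[n])$, with the three occurrences of $u$ marked underneath.]

Fig. 1. Decomposition of a cube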



In each situation of the following lemma, we place ourselves in the situation of
Figure 1. Adding some hypotheses, we deduce that $f(T_{min})$ is not cube-free.

Lemma 3. Consider $f$ an injective morphism on $\{a, b\}$, $w$ a cube-free word of
length $n \ge 6$ and integers $i, j$ such that the situation described in Figure 1 is
verified. In each of the following cases, $f(T_{min})$ contains a cube:

1. $w[j+1] = w[i] \neq w[j] = w[i+1]$, with $1 < i < j - 1 < n - 2$.

2. $w[j-1] = w[i] \neq w[j] = w[i-1]$, with $2 < i < j - 1 < n - 1$.

3. $w[i] = w[j]$ and $1 < i < j < n$.

The proof of Lemma 3 can be done using the following property, which is of
independent interest:

Property 1. Consider a morphism $f$ on $\{a, b\}$, a letter $c$ in $\{a, b\}$, and two words
$x$ and $y$, respectively prefix and suffix of the images of two cube-free words over
$\{a, b\}$ by $f$. If $f(c)x = yf(c)$ with $0 < |x| < |f(c)|$ then $f(T_{min})$ is not cube-free.

The proof of Proposition 2 now consists of seven successive steps (for lack of
space, we do not go into further detail):

Step 1. We recall the hypotheses. We consider a morphism $f$ on $\{a, b\}$, a non-empty
word $u$ and a cube-free word $w$ such that the length $n$ of $w$ is at least 8,
and $f(w)$ is a minimal cover of the cube $uuu$. We define the words $p_1$, $s_1 \neq \varepsilon$,
$p_n \neq \varepsilon$, $s_n$ such that $f(w) = p_1 uuu s_n$, $f(w[1]) = p_1 s_1$, $f(w[n]) = p_n s_n$.

We also define the integers $i$ and $j$ such that $1 \le i \le j \le n$,
$|f(w[1..i-1])| < |p_1 u| \le |f(w[1..i])|$ and $|f(w[1..j-1])| < |p_1 uu| \le |f(w[1..j])|$. We have to prove
that $f(T_{min})$ contains a cube.

We may assume that $f$ is injective, otherwise $f(aabaa)$ or $f(bbabb)$ is not
cube-free, or $f(a) = f(b) = \varepsilon$ and $f$ is cube-free.

Step 2. We prove that we are in the situation of Figure 1, that is $1 < i < j < n$.
We use length arguments to prove $i \neq 1$, $i \neq j$ and $j \neq n$. For instance,
if $i = 1$, we get $|u| \le |p_1 u| \le |f(w[1])|$ and
$|uu f(w[n])| \ge |uu s_n| \ge |f(w[2..n-1]) f(w[n])|$, which implies $|uu| \ge |f(w[2..n-1])|$.
We have $|w[2..n-1]| \ge 6$ and $w[2..n-1]$ cube-free. Such a word contains at least two $a$'s and two $b$'s.
It follows that $|u| \ge |f(a)| + |f(b)|$: A contradiction with $|u| \le |f(w[1])|$.

Step 3. Since the case $w[i] = w[j]$ is treated by Lemma 3 ($f(T_{min})$ contains a
cube), we assume $w[i] \neq w[j]$.

Step 4. Using length arguments, we prove $i \neq j - 1$.

Step 5. We prove that we cannot have simultaneously $i = 2$ and $j = n - 1$.

Step 6. In case $i = 2$ (similarly $j = n - 1$), we prove that, in the cases where there is no
contradiction, $f(T_{min})$ contains a cube.

Step 7. In case $i \neq 2$ and $j \neq n - 1$, we show also that $f(T_{min})$ is not cube-free.

4.2 Ideas of the Proof of Proposition 3

To prove Proposition 3 we again use Lemma 3. Moreover, we consider one new
set:

$S_{comp}$ = {aabaaba, aabaabb, aababaa, aababba, aabbaab, aabbaba, abaabab,
abaabba, ababaab, aababa, aababb, aabbaa, aabbab, abaaba, abbaab}.

The main interest of $S_{comp}$ is that:
$L_7 \setminus \mathrm{Fact}(T_{min}) = S_{comp} \cup \tilde{S}_{comp} \cup E(S_{comp}) \cup E(\tilde{S}_{comp})$.
To prove Proposition 3 we have to show that if the
image by a morphism $f$ of a word in $L_7 \setminus \mathrm{Fact}(T_{min})$ is a minimal cover of a
cube then at least one word in $f(T_{min})$ contains a cube. But, for any word $w$ in
$S_{comp} \cup \tilde{S}_{comp} \cup E(S_{comp}) \cup E(\tilde{S}_{comp})$, one of the words $w$, $\tilde{w}$, $E(w)$ or $E(\tilde{w})$ is in
$S_{comp}$, and if $f(w)$ is a minimal cover of a cube, then $\tilde{f}(\tilde{w})$, $(f \circ E)(E(w))$ and
$(\tilde{f} \circ E)(E(\tilde{w}))$ also are minimal covers of a cube. But $T_{min} = E(T_{min}) = \tilde{T}_{min}$. It
is thus sufficient to prove that for any word $w$ in $S_{comp}$, $f(w)$ is a minimal cover
of a cube implies $f(T_{min})$ non cube-free.






Gwenael Richomme and Francis Wlazinski 



Let $w$ be a word in $S_{comp}$ and let $f$ be a morphism on $\{a,b\}$. Assume
that $f(w)$ is a minimal cover of a cube $u^3$, i.e., $f(w) = p_1 uuu s_n$ with
$f(w_{[1]}) = p_1 s_1$, $f(w_{[n]}) = p_n s_n$, for some words $p_1$,
$s_1 \neq \varepsilon$, $p_n \neq \varepsilon$ and $s_n$. Let $i, j$ be the
integers such that $|f(w_{[1..i-1]})| < |p_1 u| < |f(w_{[1..i]})|$ and
$|f(w_{[1..j-1]})| < |p_1 uu| < |f(w_{[1..j]})|$.

If $i = 1$, we have $|u| < |f(w_{[1]})|$. Moreover $2|u| > |f(w_{[2..n-1]})|$.
But, for each word in $S_{comp}$, $w_{[2..n-1]}$ contains at least two
occurrences of $w_{[1]}$. Thus we cannot have $i = 1$. In the same way, we get
$j \neq n$. One can also prove $i \neq j$.

From now on, we are in the situation of Figure ^ We adopt its notations, that
is, $f(w_{[i]}) = p_i s_i$, $f(w_{[j]}) = p_j s_j$ and
$u = s_1 f(w_{[2..i-1]}) p_i = s_i f(w_{[i+1..j-1]}) p_j = s_j f(w_{[j+1..n-1]}) p_n$
where $s_1$, $p_i$, $p_j$ and $p_n$ are non-empty words. We consider all
the triples $(w,i,j)$ with $w$ in $S_{comp}$ and $i, j$ integers such that
$1 < i < j < |w|$. We show that each configuration is impossible or implies
that $f(T_{min})$ is not cube-free.

Some configurations lead to immediate results. In particular, this is true
for configurations that verify one case of Lemma 01. A configuration such
that "$i = 2$, $w_{[1]} \neq w_{[2]}$, $w_{[3..j-1]}$ or $w_{[j+1..n-1]}$
contains at least one $a$ and one $b$" is impossible. Indeed, in this case,
$|u| < |f(w_{[1..2]})| = |f(ab)|$ and
$|u| > \max\{|f(w_{[3..j-1]})|, |f(w_{[j+1..n-1]})|\} \geq |f(ab)|$. For the
same reason, we cannot have "$i = j - 1$, $w_{[i]} \neq w_{[j]}$,
$w_{[2..i-1]}$ or $w_{[j+1..n-1]}$ contains at least one $a$ and one $b$" and
"$j = n - 1$, $w_{[j]} \neq w_{[n]}$, $w_{[2..i-1]}$ or $w_{[i+1..j-1]}$
contains at least one $a$ and one $b$".

After the elimination of the triples for which one of the previous cases
holds, ten configurations remain to be studied (among 126): (aabaabb, 2, 6),
(aababb, 2, 5), (aababba, 2, 5), (aababba, 3, 4), (ababaab, 3, 4),
(abbab, 2, 4), (abbaab, 3, 5), (aababa, 3, 4), (aababb, 3, 4) and
(abbaab, 3, 4). In each of these cases, we obtain by techniques similar to
those described in Part R. II a contradiction or the fact that $f(T_{min})$
is not cube-free.

Acknowledgments. The authors would like to thank P. Seebold for his
encouragement during this work, and for all his helpful remarks.

Abstract. We investigate a finite state analog of subband coding, based
on linear Cellular Automata with multiple state variables. We show that 
such a CA is injective (surjective) if and only if the determinant of its 
transition matrix is an injective (surjective, respectively) single variable 
automaton. We prove that in the one-dimensional case every injective au- 
tomaton can be factored into a sequence of elementary automata, defined 
by elementary transition matrices. Finally, we investigate the factoring 
problem in higher dimensional spaces. 



1 Introduction 

Consider the frequently encountered task of encoding a discrete time signal
$\ldots, x_{-1}, x_0, x_1, \ldots$ where the sample values $x_i$ are real
numbers. Let us transform the samples $x_i$ by applying on each site $i$ a
local linear function $f$, i.e., the transformed signal will be
$\ldots, y_{-1}, y_0, y_1, \ldots$ where for every $i \in \mathbb{Z}$

$$y_i = f(x_{i-r}, \ldots, x_{i+r})$$

for some positive integer $r$ and a linear function
$f : \mathbb{R}^{2r+1} \to \mathbb{R}$. The local function $f$ is called a
finite impulse response (FIR) filter, and the same filter $f$ is used on all
sites $i$. In order to use the transformed signal as an encoding of the
original signal the global transformation has to be a one-to-one mapping. In
signal processing literature this is known as the perfect reconstruction
condition. It is easy to see that the only choices of linear $f$ that have the
perfect reconstruction property are functions that shift and/or scale the
signal by a multiplicative constant.

In order to obtain non-trivial transforms one can relax the requirement that
the same local function $f$ is applied everywhere. Instead, let us use two
linear functions $f$ and $g$ and let us apply $f$ on even sites $x_{2i}$ and
$g$ on odd sites $x_{2i+1}$. Then the transformed signal
$\ldots, y_{-1}, y_0, y_1, \ldots$ satisfies

$$y_i = \begin{cases} f(x_{i-r}, \ldots, x_{i+r}), & \text{for even } i, \text{ and} \\ g(x_{i-r}, \ldots, x_{i+r}), & \text{for odd } i. \end{cases}$$

* Research supported by NSF Grant OCR 97-33101 



H. Reichel and S. Tison (Eds.): STAGS 2000, LNCS 1770, pp. 1 10- 17771 2000. 
(c) Springer- Verlag Berlin Heidelberg 2000 



Linear Cellular Automata with Multiple State Variables 






This approach is called subband coding. Even and odd samples form two
subbands of the original signal. Suitable choices of $f$ and $g$ lead to
perfect reconstruction transformations. Simple conditions on $f$ and $g$ are
known for the perfect reconstruction condition, as well as for other desired
properties such as the orthonormality of the global transformation. In
compression applications the functions $f$ and $g$ are designed in such a way
that if the input signal is "smooth" then most of the information and signal
energy is packed in the even subband, while the odd subband contains little
information and can be heavily compressed using entropy coding. The subband
transformation can be repeated over the even subband, and further iterated
using the even subband of the previous level as the input to the next level.
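
As a hedged illustration of the subband idea for real-valued signals, the
following Python sketch uses one particular pair of filters: a pairwise average
on even sites and a pairwise difference on odd sites. This Haar-type choice and
the names `analyze` and `synthesize` are assumptions made for the example, not
filters taken from the text, and the signal is a finite list of even length
rather than a bi-infinite sequence.

```python
def analyze(x):
    """Split x into an even (average) band and an odd (difference) band."""
    half = len(x) // 2
    even = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(half)]
    odd = [x[2 * i + 1] - x[2 * i] for i in range(half)]
    return even, odd


def synthesize(even, odd):
    """Invert analyze: recover the original samples from the two bands."""
    x = []
    for s, d in zip(even, odd):
        x0 = s - d / 2            # x[2i]   = average - difference / 2
        x.extend([x0, x0 + d])    # x[2i+1] = x[2i] + difference
    return x


signal = [4, 6, 5, 5, 8, 10, 9, 7]
even, odd = analyze(signal)
print(even)                               # the "smooth" band
print(odd)                                # small differences
print(synthesize(even, odd) == signal)    # perfect reconstruction
```

Both bands are linear local functions of the input, and the synthesis step
shows that the pair satisfies the perfect reconstruction condition.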

Digital signal processing literature contains detailed studies of subband cod- 
ing of speech and images. State of the art wavelet compression algorithms are 
based on clever encodings of the subbands. The inherent multiresolution rep- 
resentation allows embedded encodings of signals, i.e., encodings where every 
prefix of a single compressed bit stream provides an approximation of the sig- 
nal, and the quality of the approximation improves as the length of the prefix 
increases. 

In this work we consider the analogous problem of encoding an infinite
sequence $\ldots, x_{-1}, x_0, x_1, \ldots$ using locally defined linear functions. However, we are
interested in coding binary sequences, or more generally, sequences over a finite 
alphabet. For the sake of linearity the alphabet is assumed to have the algebraic 
structure of a commutative ring with identity. Potential applications include 
compression of bilevel images, or graphics containing a few colors. Transforma- 
tions that apply the same local rule at all sites are known to computer scientists 
as cellular automata (CA). The perfect reconstruction condition is equivalent to 
the reversibility condition of CA. Reversible linear cellular automata have been 
studied in the past iiimiT] and — analogously to the real valued case — they 
are of little use in compression applications. In the special case of linear CA 
over a finite field the only reversible rules are shifts and/or multiplications by a 
non-zero constant. 

Following the example set by subband coding we generalize the notion of 
linear CA by allowing different local functions on odd and even lattice points, 
or more generally, m different local functions applied on m subbands. The same 
definition is more elegantly captured in the notion of a vector valued cellular 
automaton where each cell contains m components, updated according to a linear 
local rule. 

Definitions of linear CA and multiband linear CA are given in Sections 2 and
3. In Section 3 we establish necessary and sufficient conditions for the
perfect reconstruction (that is, the injectivity) property, as well as for the
surjectivity property. In Section 4 we investigate the problem of factoring
injective rules into elementary components. This is important in order to be
able to efficiently find injective rules with desired properties. The
elementary rules are also natural from the compression point of view as they
correspond to computing predictions/prediction errors between different
sublattices. In Section 5 we discuss the problems that arise in higher
dimensional cellular spaces.



2 Linear Cellular Automata 

Cellular Automata (CA) are discrete time dynamical systems consisting of an
infinite regular lattice of cells. Cells are finite state machines that change
their states synchronously according to a local rule $f$ that specifies the new
state of a cell as a function of the old states of some neighboring cells.

More precisely, let us consider a $D$-dimensional CA, with a finite state set
$Q$. The cells are positioned at the integer lattice points of the
$D$-dimensional Euclidean space, indexed by $\mathbb{Z}^D$. The global state -
or the configuration - of the system at any given time is a function
$c : \mathbb{Z}^D \to Q$ that provides the states of all cells. Let $C_Q$
denote the set of all $D$-dimensional configurations over state set $Q$.

The neighborhood vector $N = (v_1, v_2, \ldots, v_n)$ of the CA specifies the
relative locations of the neighbors of the cells: each cell
$x \in \mathbb{Z}^D$ has $n$ neighbors, in positions $x + v_i$ for
$i = 1, 2, \ldots, n$. The local rule $f : Q^n \to Q$ determines the global
dynamics $F : C_Q \to C_Q$ as follows: For every $c \in C_Q$ and
$x \in \mathbb{Z}^D$ we have

$$F(c)(x) = f\left[c(x + v_1), c(x + v_2), \ldots, c(x + v_n)\right].$$

Function $F$ is called a (global) CA function.

A Cellular Automaton is called linear (or additive) if its state set $Q$ is a
finite commutative ring with identity, usually the ring $\mathbb{Z}_m$ of
integers modulo $m$, and the local rule $f$ is a linear function
$f(q_1, q_2, \ldots, q_n) = a_1 q_1 + a_2 q_2 + \ldots + a_n q_n$ where
$a_1, a_2, \ldots, a_n$ are some constants from the ring $Q$. Linearity
simplifies the analysis of the global function $F$, and many properties that
are undecidable for general CA have polynomial time algorithms in the linear
case. Throughout this paper, when $Q$ is said to be a ring, we automatically
assume that $Q$ is a finite commutative ring with identity element 1.

A useful representation of linear local rules uses Laurent polynomials. For
simplicity, consider the one-dimensional case $D = 1$ first - the higher
dimensional cases are considered later in Section 5. Local rule

$$f(q_1, q_2, \ldots, q_n) = a_1 q_1 + a_2 q_2 + \ldots + a_n q_n$$

with neighborhood $N = (v_1, v_2, \ldots, v_n)$ is represented as the Laurent
polynomial

$$p(x) = a_1 x^{-v_1} + a_2 x^{-v_2} + \ldots + a_n x^{-v_n}.$$

The polynomial is Laurent because both negative and positive powers of $x$ are
possible. Notice that if Laurent polynomials $p(x)$ and $q(x)$ over ring $Q$
define global functions $F$ and $G$, respectively, then the product $p(x)q(x)$
represents the composition $F \circ G$. Consequently, for any $k \geq 1$,
$p^k(x)$ represents $F^k$. Also, it is useful to represent configurations as
Laurent power series: Let one-dimensional configuration $c$ correspond to the
formal power series

$$s(x) = \sum_{i=-\infty}^{\infty} c(i)\, x^i.$$

Then the product $p(x)s(x)$ represents the configuration $F(c)$, and
$p^k(x)s(x)$ represents $F^k(c)$.
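
The correspondence between the local rule and the Laurent polynomial product
can be checked on a small case. The sketch below is only an illustration under
stated assumptions: the ring is taken to be $\mathbb{Z}_4$, Laurent polynomials
and finite configurations are stored as `{exponent: coefficient}` dictionaries,
and the helper names `lmul` and `apply_rule` are hypothetical.

```python
M = 4  # assumed modulus: the state set is the ring Z_4


def lmul(p, q, m=M):
    """Multiply two Laurent polynomials stored as {exponent: coefficient}."""
    out = {}
    for ep, cp in p.items():
        for eq, cq in q.items():
            out[ep + eq] = (out.get(ep + eq, 0) + cp * cq) % m
    return {e: c for e, c in out.items() if c}


def apply_rule(coeffs, offsets, config, m=M):
    """Apply f(q_1..q_n) = sum a_i q_i with neighbourhood offsets v_i directly."""
    # Cell x can become non-zero only if x + v hits a non-zero cell for some v.
    cells = {pos - v for pos in config for v in offsets}
    new = {}
    for x in cells:
        val = sum(a * config.get(x + v, 0) for a, v in zip(coeffs, offsets)) % m
        if val:
            new[x] = val
    return new


coeffs, offsets = [1, 3, 2], [-1, 0, 1]
p = {-v: a for a, v in zip(coeffs, offsets)}   # p(x) = sum a_i x^(-v_i)
s = {0: 1, 1: 2, 5: 3}                         # finite configuration as a series

direct = apply_rule(coeffs, offsets, s)
via_product = lmul(p, s)
print(direct == via_product)
```

Applying the rule twice and multiplying by $p^2(x)$ agree in the same way,
which is the statement that $p^k(x)s(x)$ represents $F^k(c)$.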

Let us denote the set of Laurent polynomials over $Q$ by $Q[x, x^{-1}]$. The
set $Q[x, x^{-1}]$ itself is an (infinite) commutative ring. Let
$Q[[x, x^{-1}]]$ denote the set of Laurent series over $Q$. Notice that
elements of $Q[[x, x^{-1}]]$ can be added but not multiplied with each other.
The product of a Laurent series and a Laurent polynomial is well defined.

It is easy to see that a CA function $F$ is linear, i.e., defined by some
Laurent polynomial $p(x)$, if and only if it is a linear transformation of
$Q[[x, x^{-1}]]$, i.e., if and only if $F(c + d) = F(c) + F(d)$ and
$F(a \cdot c) = aF(c)$ for all configurations $c$ and $d$, and all
$a \in Q$ |21 .

Classical results concerning Cellular Automata address the injectivity and
surjectivity problems. A CA is called injective if its global function $F$ is
one-to-one and surjective if the global function is onto. The following basic
properties are valid in any dimension $D$, and they are valid for unrestricted
CA, that is, not only in the linear case:

1. Every injective CA is also surjective I0E|.

2. If $F$ is an injective CA function then its inverse $F^{-1}$ is also a CA
function mni, computed by the inverse cellular automaton. Sometimes injective
CA are called invertible or reversible.

3. A CA function $F$ is surjective if and only if $F$ is injective on the set
of finite configurations cm . (A configuration $c$ is called finite w.r.t.
state $q$ if only a finite number of cells are in states different from $q$.
The statement is valid for any choice of $q \in Q$.)

In the unrestricted case it is difficult to characterize local rules that make the 
CA injective or surjective. If the space is at least two-dimensional then it is 
even undecidable whether a given local rule defines an injective or surjective 
dynamics 0. (In the one-dimensional case decision algorithms exist 0.) 

Linearity simplifies the analysis. The inverse of a linear injective function
is also a linear function, so $p(x)$ defines an injective CA if and only if
there exists a Laurent polynomial $q(x)$ such that $p(x)q(x) = 1$. Such $q(x)$
is the local rule of the inverse automaton. And based on property 3, $p(x)$
defines a surjective CA if and only if there does not exist a Laurent
polynomial $q(x) \neq 0$ such that $p(x)q(x) = 0$. Such $q(x)$ would namely
represent a finite configuration $c$ such that $F(c) = 0 = F(0)$. In other
words, the linear CA represented by Laurent polynomial $p(x)$ is injective
(non-surjective) if and only if $p(x)$ is a unit (a zero divisor,
respectively) of the ring $Q[x, x^{-1}]$.

It turns out that inclusions of the coefficients of $p(x)$ in the maximal
ideals of ring $Q$ determine the injectivity and surjectivity status of the CA,
as proved in [T7j :
Proposition 1. (Sato 1993) The linear CA represented by polynomial $p(x)$ over
ring $Q$ is

(a) surjective if and only if no maximal ideal of $Q$ contains all
coefficients of $p(x)$.

(b) injective if and only if for every maximal ideal exactly one coefficient
of $p(x)$ is outside the ideal.

In the special case $Q = \mathbb{Z}_m$ we obtain the following well-known
result 0:

Corollary 1. (Ito et al. 1983) The linear CA represented by polynomial
$a_1 x^{v_1} + a_2 x^{v_2} + \ldots + a_n x^{v_n}$ over ring $\mathbb{Z}_m$ is

(a) surjective if and only if $\gcd(m, a_1, a_2, \ldots, a_n) = 1$.

(b) injective if and only if every prime factor $p$ of $m$ divides all but
exactly one of the coefficients $a_1, a_2, \ldots, a_n$.

The conditions of Proposition 1 can be rephrased in the following
easy-to-check characterizations of injective and surjective linear rules:

Proposition 2. The linear CA represented by polynomial $p(x)$ over ring $Q$ is

(a) surjective if and only if $a \cdot p(x) \neq 0$ for every
$a \in Q \setminus \{0\}$,

(b) injective if and only if for every $a \in Q \setminus \{0\}$ there exists
$b \in Q$ such that $ab \cdot p(x)$ is a monomial.
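
The two conditions of Corollary 1 translate directly into a short test. The
following Python sketch is one possible rendering for $Q = \mathbb{Z}_m$; the
function names are hypothetical, the coefficients are passed as a plain list
(the exponents play no role in either condition), and the trial-division
factorization is only meant for small $m$.

```python
from functools import reduce
from math import gcd


def prime_factors(m):
    """Return the set of prime factors of m (trial division, small m only)."""
    factors, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors


def is_surjective(coeffs, m):
    """Corollary 1(a): gcd(m, a_1, ..., a_n) = 1."""
    return reduce(gcd, coeffs, m) == 1


def is_injective(coeffs, m):
    """Corollary 1(b): each prime factor of m divides all but exactly one a_i."""
    return all(
        sum(1 for a in coeffs if a % p != 0) == 1
        for p in prime_factors(m)
    )


print(is_injective([3, 4], 6), is_surjective([3, 4], 6))   # 3x^a + 4x^b over Z_6
print(is_injective([1, 1], 2), is_surjective([1, 1], 2))   # XOR rule over Z_2
print(is_injective([2, 4], 6), is_surjective([2, 4], 6))   # all coefficients even
```

For instance, the rule with coefficients 1 and 2 over $\mathbb{Z}_4$ passes the
injectivity test, in line with $(1 + 2x)^2 = 1$ in $\mathbb{Z}_4[x, x^{-1}]$.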



3 Multiband Linear Rules 

In the previous section we considered linear Cellular Automata over ring Q, 
where each cell used the same linear local rule. Let us generalize the situation by 
allowing different linear rules on even and odd numbered cells. In this section we 
analyze such CA — and the more general case of m different local rules applied 
in different positions modulo m — and we provide algorithms to determine the 
injectivity and the surjectivity status of any such automaton. 

Let $p_1(x)$ and $p_2(x)$ be the Laurent polynomials representing the local
rules on even and odd cells, respectively. Let us separate the even and odd
powers of $x$ in the polynomials, i.e., let us find Laurent polynomials
$p_{11}(x)$, $p_{12}(x)$, $p_{21}(x)$ and $p_{22}(x)$ such that

$$p_1(x) = p_{11}(x^2) + x \cdot p_{12}(x^2), \text{ and}$$
$$p_2(x) = x \cdot p_{21}(x^2) + p_{22}(x^2).$$

Notice that $p_{11}(x)$ and $p_{12}(x)$ represent the contributions of even and
odd cells to even cells, respectively, and $p_{21}(x)$ and $p_{22}(x)$
represent the contributions of even and odd cells to odd cells, respectively.
So if $c_1(x)$ and $c_2(x)$ are Laurent power series representing the states of
even and odd cells at any given time, then series

$$p_{11}(x)c_1(x) + p_{12}(x)c_2(x), \text{ and}$$
$$p_{21}(x)c_1(x) + p_{22}(x)c_2(x)$$

represent the states of even and odd cells one time step later. Using matrix
notation, the new configuration is represented by

$$\begin{pmatrix} p_{11}(x) & p_{12}(x) \\ p_{21}(x) & p_{22}(x) \end{pmatrix} \begin{pmatrix} c_1(x) \\ c_2(x) \end{pmatrix}.$$

This leads to the following definition of linear CA with multiple state
variables. Let us combine blocks of two cells into single "super" cells. Then
each cell contains two elements of the ring $Q$, and both elements are updated
according to some local linear rules. The two elements may have different
rules, but all "super" cells are identical.

More generally, let us allow every cell to store $m$ elements of the ring $Q$,
so the states are $m$-tuples $(q_1, q_2, \ldots, q_m) \in Q^m$. Let us extend
the Laurent power series notation for configurations. A configuration $c$
consists of $m$ Laurent power series $c_i(x)$ over ring $Q$, organized as a
column vector of size $m$:

$$c(x) = \begin{pmatrix} c_1(x) \\ c_2(x) \\ \vdots \\ c_m(x) \end{pmatrix}.$$

Series $c_i(x)$ gives the states of the $i$'th state variables of all cells,
and we call it the $i$'th band of the configuration.

The local rule is a linear function. It is defined by an $m \times m$ square
matrix of Laurent polynomials:

$$A(x) = \begin{pmatrix} p_{11}(x) & p_{12}(x) & \ldots & p_{1m}(x) \\ p_{21}(x) & p_{22}(x) & \ldots & p_{2m}(x) \\ \vdots & \vdots & \ddots & \vdots \\ p_{m1}(x) & p_{m2}(x) & \ldots & p_{mm}(x) \end{pmatrix}.$$

We call this the transition matrix of the automaton. Laurent polynomial
$p_{ij}(x)$ gives the contribution of the $j$'th band to the $i$'th band.
Analogously to the single variable case, the result of applying CA $A(x)$ to
configuration $c(x)$ is configuration $A(x)c(x)$. Composition of two CA is then
given by the product of their transition matrices, and the $k$'th iterate of CA
$A(x)$ on initial configuration $c(x)$ is $A^k(x) \cdot c(x)$. All products are
standard matrix products over matrices whose elements are Laurent polynomials
and series. We call such CA linear $m$-band CA over ring $Q$, and we use the
abbreviation $m$-band LCA. In the single band case $m = 1$ the definition is
identical to the normal linear CA over $Q$.

Notice that a CA function $F$ is an $m$-band LCA if and only if it is a linear
function on the set $Q[[x, x^{-1}]]^m$ of configurations:
$F(c + d) = F(c) + F(d)$ and $F(a \cdot c) = aF(c)$ for all configurations $c$
and $d$ and ring element $a \in Q$. Consequently, the inverse CA of any
injective $m$-band LCA is also an $m$-band LCA.

The following proposition reduces the injectivity and surjectivity questions 
of m-band CA into the well understood single band case: 
Proposition 3. An $m$-band LCA over ring $Q$ is injective (surjective) if and
only if the determinant of its transition matrix is an injective (surjective,
respectively) one-band LCA over $Q$.

Proof. Let $A(x)$ be the transition matrix of an $m$-band LCA.

1. Injectivity: The CA defined by $A(x)$ is injective if and only if $A(x)$ is
an invertible matrix, that is, if and only if there exists a transition matrix
$B(x)$ (of the inverse automaton) such that $A(x)B(x) = I$. A matrix over a
commutative ring is invertible if and only if its determinant is a unit of the
ring (see for example mi, Theorem 50). Notice that the elements of transition
matrices are elements of the commutative ring $Q[x, x^{-1}]$. So the CA is
invertible if and only if $\det A(x)$ is an invertible element of
$Q[x, x^{-1}]$, that is, if and only if the one-band LCA defined by
$\det A(x)$ is injective.

2. Surjectivity: The CA defined by $A(x)$ is surjective if and only if
$A(x)B(x) \neq 0$ for every transition matrix $B(x) \neq 0$. This is
equivalent to $A(x)$ not being a zero divisor because any square matrix over a
commutative ring is a left zero divisor if and only if it is a right zero
divisor. On the other hand, a square matrix is a zero divisor if and only if
its determinant is a zero divisor (McCoy |l Ij . Theorem 51), so $A(x)$
defines a surjective CA if and only if $\det A(x)$ is not a zero divisor of
$Q[x, x^{-1}]$, i.e., if and only if $\det A(x)$ defines a surjective one-band
LCA. □

According to the proposition $\det A(x)$ is a single band LCA that has the
same injectivity and surjectivity status as the multiband CA $A(x)$.
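
To make the reduction to the determinant concrete, here is a minimal Python
sketch. It assumes the state ring is a prime field $\mathbb{Z}_p$ (so that a
Laurent polynomial is a unit exactly when it is a non-zero monomial and a zero
divisor exactly when it is zero), stores Laurent polynomials as
`{exponent: coefficient}` dictionaries, and uses hypothetical helper names; the
determinant is computed by cofactor expansion, which is only practical for a
small number of bands.

```python
def ladd(p, q, m, sign=1):
    """Return p + sign*q for Laurent polynomials stored as {exponent: coeff}."""
    out = dict(p)
    for e, c in q.items():
        out[e] = (out.get(e, 0) + sign * c) % m
    return {e: c for e, c in out.items() if c}


def lmul(p, q, m):
    """Return the product p*q over Z_m."""
    out = {}
    for ep, cp in p.items():
        for eq, cq in q.items():
            out[ep + eq] = (out.get(ep + eq, 0) + cp * cq) % m
    return {e: c for e, c in out.items() if c}


def det(A, m):
    """Determinant by cofactor expansion along the first column."""
    if len(A) == 1:
        return A[0][0]
    total = {}
    for i, row in enumerate(A):
        minor = [r[1:] for k, r in enumerate(A) if k != i]
        term = lmul(row[0], det(minor, m), m)
        total = ladd(total, term, m, sign=1 if i % 2 == 0 else -1)
    return total


def is_injective(A, p):
    """Over the field Z_p: det A(x) must be a non-zero monomial."""
    return len(det(A, p)) == 1


def is_surjective(A, p):
    """Over the field Z_p: det A(x) must be non-zero."""
    return len(det(A, p)) >= 1


ONE, X = {0: 1}, {1: 1}
A = [[{0: 1, 1: 1}, ONE], [X, ONE]]   # [[1 + x, 1], [x, 1]], determinant 1
B = [[ONE, ONE], [ONE, X]]            # determinant x + 1 over Z_2
C = [[ONE, ONE], [ONE, ONE]]          # determinant 0
for name, mat in (("A", A), ("B", B), ("C", C)):
    print(name, det(mat, 2), is_injective(mat, 2), is_surjective(mat, 2))
```

For a general ring $\mathbb{Z}_m$ the two final tests would have to be replaced
by the conditions of Corollary 1 applied to the coefficients of the determinant.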

4 Factoring Injective Rules into Elementary Components 

Proposition 3 gives a characterization of injective m-band linear CA. However, it
does not provide a simple method of constructing injective automata, apart from 
trying different matrices one-by-one and checking whether their determinants 
satisfy the condition of Proposition El^b) . In this section we consider simple CA 
rules, called elementary rules, that are trivially invertible and easy to construct. 
Then we show how any injective m-band LCA can be factored into a composition 
of such elementary CA. 

For any Laurent polynomial $p(x) \in Q[x, x^{-1}]$ and any $i \neq j$ we
define an elementary $m$-band LCA with transition matrix $E_{ij}(p(x))$ as
follows: All diagonal elements of $E_{ij}(p(x))$ are 1, element $(i,j)$ is
$p(x)$, and all other elements are 0. In other words, the elementary automaton
$E_{ij}(p(x))$ adds to band $i$ the result of applying $p(x)$ to band $j$. This
is known as an elementary row operation in linear algebra. The automaton is
injective, with determinant 1, and the inverse of $E_{ij}(p(x))$ is
$E_{ij}(-p(x))$.
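
The elementary automata and their inverses are easy to experiment with. The
sketch below is an illustration only: it assumes $Q = \mathbb{Z}_4$, uses
0-based band indices, stores Laurent polynomials as `{exponent: coefficient}`
dictionaries, and all helper names are hypothetical.

```python
def ladd(p, q, m):
    """Return p + q for Laurent polynomials stored as {exponent: coeff}."""
    out = dict(p)
    for e, c in q.items():
        out[e] = (out.get(e, 0) + c) % m
    return {e: c for e, c in out.items() if c}


def lmul(p, q, m):
    """Return the product p*q over Z_m."""
    out = {}
    for ep, cp in p.items():
        for eq, cq in q.items():
            out[ep + eq] = (out.get(ep + eq, 0) + cp * cq) % m
    return {e: c for e, c in out.items() if c}


def mat_mul(A, B, m):
    """Standard matrix product with Laurent polynomial entries."""
    C = []
    for row in A:
        new_row = []
        for j in range(len(B[0])):
            acc = {}
            for k, entry in enumerate(row):
                acc = ladd(acc, lmul(entry, B[k][j], m), m)
            new_row.append(acc)
        C.append(new_row)
    return C


def identity(n):
    return [[{0: 1} if i == j else {} for j in range(n)] for i in range(n)]


def elementary(n, i, j, p):
    """E_ij(p): identity matrix with p placed in position (i, j), i != j."""
    E = identity(n)
    E[i][j] = dict(p)
    return E


def lneg(p, m):
    return {e: (-c) % m for e, c in p.items()}


M = 4
p = {-1: 3, 2: 1}                         # p(x) = 3/x + x^2
E = elementary(3, 0, 2, p)
E_inv = elementary(3, 0, 2, lneg(p, M))
c = [[{0: 1}], [{}], [{1: 2}]]            # three bands as a column vector

print(mat_mul(E, E_inv, M) == identity(3))
print(mat_mul(E_inv, mat_mul(E, c, M), M) == c)
```

Only band 0 changes when `E` is applied, which is why undoing the step needs
nothing more than subtracting the same contribution of band 2 again.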

Because $\det AB = \det A \det B$, the determinant of any composition of
elementary CA is 1. We are interested in the opposite direction: Is every
matrix with determinant 1 a product of elementary matrices? It turns out that
this claim is true. However, unlike other results we have seen so far, this
claim does not easily generalize to higher dimensional cellular spaces. This
point will be discussed in Section 5.

Let us use the following standard notations: 

• (general linear group) $GL_m(R)$ is the set of invertible $m \times m$
matrices over the ring $R$. These are the matrices over $R$ whose determinant
is a unit of $R$. In our case $R = Q[x, x^{-1}]$ the elements of $GL_m(R)$ are
exactly the injective $m$-band LCA.

• (special linear group) $SL_m(R)$ is the set of $m \times m$ matrices over
ring $R$ whose determinant is 1.

• $E_m(R)$ are the $m \times m$ matrices over $R$ that are products of
elementary matrices.

We are interested in determining whether $E_m(R) = SL_m(R)$ in the case of
$R = Q[x, x^{-1}]$.

Clearly, if $A \in GL_m(R)$ then, for any matrix $M$ whose determinant is
$(\det A)^{-1}$, we have $MA \in SL_m(R)$. Matrix $M$ can be as simple as the
diagonal matrix whose diagonal entries are 1 except for the first element,
which is $(\det A)^{-1}$. Therefore, a factorization of matrices in $SL_m(R)$
into elementary matrices provides a factorization of matrices in $GL_m(R)$
into products of a diagonal matrix and a sequence of elementary matrices.

Proposition 4. Let $R = Q[x, x^{-1}]$ where $Q$ is a finite commutative ring
with identity. Then, for every $m \geq 1$, $SL_m(R) = E_m(R)$.

Proof. Without loss of generality we may assume that Q is a local ring. Every 
finite commutative ring is namely a direct sum of local rings m- 

Let us briefly review some properties of local rings. See IQI for proofs and 
more details. A local ring $Q$ has a unique maximal ideal $M$, and the quotient
ring $Q/M$ is a field. Ideal $M$ consists of the zero-divisors of $Q$. Every
element outside $M$ is a unit. Set $M$ is nilpotent, which means that there
exists an integer $n$ such that the product of any $n$ elements of $M$ is 0.

Let us consider Laurent polynomials $p(x)$ over local ring $Q$, with maximal
ideal $M$. We know that $p(x)$ is invertible iff exactly one coefficient is not
in $M$. Let us define the degree of a Laurent polynomial in the following,
slightly non-standard way. Let us separate the unit and non-unit terms of the
polynomial:

$$p(x) = (a_1 x^{v_1} + a_2 x^{v_2} + \ldots + a_n x^{v_n}) + (b_1 x^{u_1} + b_2 x^{u_2} + \ldots + b_k x^{u_k})$$

where $v_1 < v_2 < \ldots < v_n$, $u_1 < u_2 < \ldots < u_k$, $v_i \neq u_j$
for all $i$ and $j$, every $a_i$ is a unit and every $b_i$ is a non-unit, i.e.
an element of $M$. The degree $\deg p(x)$ is then $v_n - v_1$ if $n \geq 1$,
and $-\infty$ if $n = 0$. In other words, the non-unit terms are ignored in the
calculation of the degree. We have the following easily verifiable properties:
For all $p(x)$ and $q(x)$, $\deg p(x)q(x) = \deg p(x) + \deg q(x)$, and if
$\deg p(x) = -\infty$ then $\deg[p(x) + q(x)] = \deg q(x)$. Laurent polynomial
$p(x)$ is invertible iff its degree is 0.
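
The non-standard degree and its properties can be tried out on a concrete
local ring. The sketch below assumes $Q = \mathbb{Z}_9$, whose maximal ideal
$M = \{0, 3, 6\}$ consists of the multiples of 3; the dictionary representation
and the helper names are assumptions made for the example.

```python
import math


def ladd(p, q, m):
    """Return p + q for Laurent polynomials stored as {exponent: coeff}."""
    out = dict(p)
    for e, c in q.items():
        out[e] = (out.get(e, 0) + c) % m
    return {e: c for e, c in out.items() if c}


def lmul(p, q, m):
    """Return the product p*q over Z_m."""
    out = {}
    for ep, cp in p.items():
        for eq, cq in q.items():
            out[ep + eq] = (out.get(ep + eq, 0) + cp * cq) % m
    return {e: c for e, c in out.items() if c}


def degree(poly, prime):
    """Span of the unit terms; coefficients divisible by `prime` lie in M."""
    units = [e for e, c in poly.items() if c % prime != 0]
    return max(units) - min(units) if units else -math.inf


MOD, PRIME = 9, 3                      # assumed local ring Z_9, M = {0, 3, 6}
f = {-2: 1, 0: 3, 1: 2, 4: 6}          # unit terms at exponents -2 and 1
g = {0: 2, 2: 1, 5: 3}                 # unit terms at exponents 0 and 2
n = {0: 3, 1: 6}                       # no unit terms at all

print(degree(f, PRIME), degree(g, PRIME), degree(lmul(f, g, MOD), PRIME))
print(degree(n, PRIME), degree(ladd(n, g, MOD), PRIME))
print(degree({3: 4, 0: 3}, PRIME))     # exactly one unit term
```

Using `-math.inf` for the degree of a polynomial without unit terms keeps the
additivity rule valid without special cases.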
The existence of a division algorithm is crucial: Let $f(x)$ and $g(x)$ be
Laurent polynomials such that $\deg g(x) \geq 0$. Then there exist Laurent
polynomials $q(x)$ and $r(x)$ such that

$$f(x) = g(x)q(x) + r(x)$$

and $\deg r(x) < \deg g(x)$.
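
A minimal sketch of such a division, restricted to the simplest case: it
assumes that $Q$ is a prime field $\mathbb{Z}_p$, so that $M = \{0\}$ and the
degree is just the difference between the largest and smallest exponent. The
general local-ring case treated in the text also has to cope with non-unit
terms and is not covered here. The helper names are hypothetical, and
`pow(a, -1, p)` requires Python 3.8 or later.

```python
def ladd(p, q, m, sign=1):
    """Return p + sign*q for Laurent polynomials stored as {exponent: coeff}."""
    out = dict(p)
    for e, c in q.items():
        out[e] = (out.get(e, 0) + sign * c) % m
    return {e: c for e, c in out.items() if c}


def lmul(p, q, m):
    """Return the product p*q over Z_m."""
    out = {}
    for ep, cp in p.items():
        for eq, cq in q.items():
            out[ep + eq] = (out.get(ep + eq, 0) + cp * cq) % m
    return {e: c for e, c in out.items() if c}


def span(p):
    """Degree of a non-zero Laurent polynomial over a field."""
    return max(p) - min(p)


def ldivmod(f, g, p):
    """Return (q, r) with f = g*q + r and r = 0 or span(r) < span(g)."""
    top = max(g)
    inv_lead = pow(g[top], -1, p)          # inverse of the leading coefficient
    q, r = {}, dict(f)
    while r and span(r) >= span(g):
        shift = max(r) - top               # align the leading terms
        c = (r[max(r)] * inv_lead) % p
        q[shift] = c
        r = ladd(r, lmul({shift: c}, g, p), p, sign=-1)
    return q, r


P = 5
f = {-3: 2, 0: 1, 4: 3}                    # 2/x^3 + 1 + 3x^4
g = {-1: 1, 1: 2}                          # 1/x + 2x
q, r = ldivmod(f, g, P)
print(q, r)
print(ladd(lmul(g, q, P), r, P) == f)
```

Each pass cancels the highest term of the remainder without introducing terms
below its lowest exponent, so the span shrinks until it drops below that of `g`.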

To prove Proposition 4 we show that any member of $SL_m(R)$ can be reduced
to the identity matrix using elementary row and column operations. Because
elementary row and column operations correspond to multiplying the matrix
by an elementary matrix from the left and right, respectively, and because the
inverse of an elementary matrix is elementary, this proves the proposition.

Let us use mathematical induction on $m$, the size of the matrices. If $m = 1$
the claim is trivial. Assume then the claim has been proved for matrices of
size $(m - 1) \times (m - 1)$ and consider an $m \times m$ matrix
$A(x) = (p_{ij}(x))$ whose determinant is 1. So the degree of the determinant
is 0. The determinant is a linear expression

$$\det A(x) = p_{11}(x)q_1(x) + p_{21}(x)q_2(x) + \ldots + p_{m1}(x)q_m(x)$$

of the first column where the coefficients $q_i(x)$ are cofactors of the
matrix. If all the elements $p_{i1}(x)$ of the first column had degree
$-\infty$ then we would have $\deg[\det A(x)] = -\infty$. So at least one
$p_{i1}(x)$ has degree $\geq 0$.

If more than one element on the first column has degree $\geq 0$ we can use
the division algorithm to reduce the degree of one of them: If
$\deg p_{i1}(x) \geq \deg p_{j1}(x) \geq 0$ then there exist $q(x)$ and
$p'_{i1}(x)$ such that $p_{i1}(x) = q(x)p_{j1}(x) + p'_{i1}(x)$ and
$\deg p'_{i1}(x) < \deg p_{i1}(x)$. So the degree of the element $(i, 1)$ can
be reduced using the elementary row operation $E_{ij}(-q(x))$. This process
can be repeated until only one element of the first column has degree
$\geq 0$.

Once only one element, say $p_{i1}(x)$, of the first column has degree
$\geq 0$, then

$$0 = \deg[\det A(x)] = \deg[p_{11}(x)q_1(x) + p_{21}(x)q_2(x) + \ldots + p_{m1}(x)q_m(x)] = \deg p_{i1}(x) + \deg q_i(x).$$

We must have $\deg p_{i1}(x) = 0$, i.e., $p_{i1}(x)$ is invertible. If
$i \neq 2$ we add row $i$ to row 2 to obtain an invertible element in position
$(2, 1)$. Then we can add suitable multiples of row 2 to other rows to obtain 1
in position $(1, 1)$, and 0 in positions $(j, 1)$ for all $j > 2$. Finally,
using the elementary row operation $E_{21}(-p_{21}(x))$ we reduce also element
$(2, 1)$ of the matrix to 0.

Using elementary row operations we were able to transform $A(x)$ into a
matrix whose first column is the first column of the identity matrix. Using
$m - 1$ elementary column operations we can then make the elements $(1, i)$,
$i \geq 2$, of the first row equal to 0. We end up with matrix

$$\begin{pmatrix} 1 & 0 & \ldots & 0 \\ 0 & & & \\ \vdots & & A'(x) & \\ 0 & & & \end{pmatrix}$$

where $A'(x)$ is an $(m - 1) \times (m - 1)$ matrix with determinant 1.
According to the inductive hypothesis $A'(x)$ can be reduced to the identity
matrix using elementary row and column operations. □
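
The reduction used in the proof can be followed step by step in the smallest
non-trivial case. The sketch below is an illustration under simplifying
assumptions, not the general construction: it handles only $m = 2$ bands and
only a prime field $Q = \mathbb{Z}_p$ (so every non-zero coefficient is a unit
and a unit of $Q[x, x^{-1}]$ is a monomial). All function names are
hypothetical, and `pow(a, -1, p)` requires Python 3.8 or later.

```python
def ladd(p, q, m, sign=1):
    """Return p + sign*q for Laurent polynomials stored as {exponent: coeff}."""
    out = dict(p)
    for e, c in q.items():
        out[e] = (out.get(e, 0) + sign * c) % m
    return {e: c for e, c in out.items() if c}


def lmul(p, q, m):
    """Return the product p*q over Z_m."""
    out = {}
    for ep, cp in p.items():
        for eq, cq in q.items():
            out[ep + eq] = (out.get(ep + eq, 0) + cp * cq) % m
    return {e: c for e, c in out.items() if c}


def lneg(p, m):
    return {e: (-c) % m for e, c in p.items()}


def span(p):
    return max(p) - min(p)


def ldivmod(f, g, m):
    """Division over the field Z_m: f = g*q + r, r = 0 or span(r) < span(g)."""
    top, inv_lead = max(g), pow(g[max(g)], -1, m)
    q, r = {}, dict(f)
    while r and span(r) >= span(g):
        shift, c = max(r) - top, (r[max(r)] * inv_lead) % m
        q[shift] = c
        r = ladd(r, lmul({shift: c}, g, m), m, sign=-1)
    return q, r


def reduce_sl2(A, m):
    """Reduce a 2x2 determinant-1 matrix to the identity; return the steps."""
    A = [list(row) for row in A]
    ops = []

    def row_op(i, j, t):                  # E_ij(t): row i += t * row j
        A[i] = [ladd(a, lmul(t, b, m), m) for a, b in zip(A[i], A[j])]
        ops.append(("row", i, j, t))

    while A[0][0] and A[1][0]:            # Euclidean algorithm on column 1
        i, j = (0, 1) if span(A[0][0]) >= span(A[1][0]) else (1, 0)
        q, _ = ldivmod(A[i][0], A[j][0], m)
        row_op(i, j, lneg(q, m))
    if not A[1][0]:                       # put an invertible entry at (2, 1)
        row_op(1, 0, {0: 1})
    [(e, u)] = A[1][0].items()            # a unit is a monomial u * x^e
    c_inv = {-e: pow(u, -1, m)}
    row_op(0, 1, lmul(ladd({0: 1}, A[0][0], m, sign=-1), c_inv, m))  # (1,1) := 1
    row_op(1, 0, lneg(A[1][0], m))        # (2,1) := 0
    t = lneg(A[0][1], m)                  # column op: column 2 += t * column 1
    for r in A:
        r[1] = ladd(r[1], lmul(t, r[0], m), m)
    ops.append(("col", 1, 0, t))
    return A, ops


# [[1 + x, 1 + 2x + 2x^2], [1/x, 2 + 1/x]] has determinant 1 over Z_3.
A = [[{0: 1, 1: 1}, {0: 1, 1: 2, 2: 2}], [{-1: 1}, {-1: 1, 0: 2}]]
reduced, steps = reduce_sl2(A, 3)
print(reduced)
for step in steps:
    print(step)
```

Reading the recorded operations backwards, with each polynomial negated, gives
a factorization of the input matrix into elementary matrices.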



5 Higher Dimensional Cellular Spaces 

The results of Sections 2 and 3 can readily be generalized to higher
dimensional cellular spaces. Linear rules on $D$-dimensional cellular spaces
are represented as Laurent polynomials over $D$ variables
$x_1, x_2, \ldots, x_D$. We will abbreviate $x = (x_1, x_2, \ldots, x_D)$, and
denote $p(x)$ if $p$ is a Laurent polynomial of variables
$x_1, x_2, \ldots, x_D$. Let $Q[x, x^{-1}]$ be the ring of Laurent polynomials
over $x_1, x_2, \ldots, x_D$.

Configurations are Laurent power series over the same variables. Let
$Q[[x, x^{-1}]]$ denote the set of Laurent power series over
$x_1, x_2, \ldots, x_D$. Product $p(x)c(x)$ represents the configuration
$F(c)$ if $p(x)$ represents $F$ and power series $c(x)$ represents
configuration $c$. As in the one-dimensional case, set $Q[x, x^{-1}]$ is a
commutative ring.

A $D$-dimensional $m$-band LCA is defined analogously to the one-dimensional
case. It is specified by an $m \times m$ matrix of Laurent polynomials over
$x_1, x_2, \ldots, x_D$. Propositions 1, 2 and 3 remain valid in the
$D$-dimensional case: the proofs are analogous and use only the fact that
$Q[x, x^{-1}]$ is a commutative ring.

However, factoring injective CA into elementary components is considerably
harder in the higher dimensional spaces. The proof of Proposition 4 in the
previous section does not work if $D \geq 2$ because the division algorithm is
no longer available. The proof was essentially the Euclidean algorithm where
the degree of the polynomials on the first column was reduced using the
division algorithm. In the higher dimensional case there is no natural notion
of the degree of a polynomial, and the Euclidean algorithm cannot be used.
Nevertheless, factorization is possible if the number of bands is at least
three, but other techniques need to be used.

The problem of factoring matrices of the special linear group $SL_m(R)$ into
products of elementary matrices has been investigated for various commutative
rings $R$. A classic result by Suslin m states that $SL_m(R) = E_m(R)$ if
$m \geq 3$ and $R = k[x]$ is the ring of polynomials over a field $k$. In
contrast, case $m = 2$ is different E): $SL_2(R) \neq E_2(R)$ for polynomial
rings $R = k[x]$. Note that the results concern polynomials, not Laurent
polynomials. But in H. Park introduced a technique to transform Laurent
polynomials into non-Laurent polynomials in such a way that Suslin's theorem
can be extended to Laurent polynomials over a field.

Because the quotient $Q/M$ of a local ring $Q$ and its unique maximal ideal
$M$ is a field, Park's result proves that one can use elementary operations to
reduce any $A(x) \in SL_m(Q[x, x^{-1}])$, for $m \geq 3$ and local ring $Q$,
into $I + B(x)$, where $I$ is the identity matrix and $B(x)$ is a matrix of
Laurent polynomials with coefficients in $M$. It is then straightforward to
reduce $I + B(x)$ into $I$. Because every finite commutative ring is a direct
sum of local rings, we have

Proposition 5. Let $R = Q[x, x^{-1}]$ where $Q$ is a finite commutative ring.
Then $SL_m(R) = E_m(R)$ for every $m \geq 3$.

The case $m = 2$ remains open. Cohn's counter example

$$\begin{pmatrix} x^2 & xy + 1 \\ xy - 1 & y^2 \end{pmatrix}$$

for polynomials over any field ^ is not a counter example for Laurent
polynomials because the diagonal elements are invertible as Laurent
polynomials.

6 Conclusions 

We have introduced and investigated linear Cellular Automata with m state 
variables over a commutative ring Q. Especially we studied automata that are 
invertible and proved that, in the one-dimensional case, such automata can be 
factored into elementary components. In higher dimensional spaces with two 
bands it remains an outstanding open problem whether such a factorization 
exists. One should note that our factorization corresponds to so-called ladder 
decompositions in the theory of subband coding of signals 0. 

A motivation for this study came from potential applications to compression 
of binary images and signals. The idea, analogous to subband coding, is to divide 
the signal into odd and even samples and use a 2-band CA to pack most of the 
information into the even samples. The process can then be repeated on the 
even band, and iterated over several levels. Using zerotree coding PS| one can 
encode the odd samples on different levels into a compact and fully embedded 
representation of the original signal. 

In this work we restricted the study to linear CA. Unfortunately, linearity is
a constraint that severely limits the compression results one can hope to get.
Non-linear CA allow more flexibility, but they are also much harder to analyze.
One possible next step would be to extend the factorization result to the
non-linear case. It is straightforward to generalize the notion of an
elementary operation to non-linear CA over m bands. A non-linear elementary
operation changes the variables of one band only. The variables are changed
according to some permutation π of Q. Which permutation π is used in any given
cell is determined by the variables on the other m - 1 bands in the
neighborhood. Such an elementary step is trivially invertible because the
inverse permutation restores the original states. It would be interesting to
investigate which injective CA can be obtained by combining such non-linear
elementary operations. Notice that each elementary step can be viewed as making
a prediction for the value of a variable based on known values on the other
m - 1 bands in the vicinity and storing the prediction error in the variable.
If the prediction is good, then the prediction error will have low information
content and will be compressible.
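
The prediction view of an elementary step can be illustrated with a small
sketch. It is not the construction analyzed above, only an example under
stated assumptions: two bands (even and odd samples), states in Z_q, a cyclic
neighborhood, and a hypothetical predictor function `predict` chosen by the
caller. The permutation applied to each odd cell is the shift by the negated
prediction, so the step is invertible even when `predict` is non-linear.

```python
from typing import Callable, List

Predictor = Callable[[int, int], int]  # hypothetical: (left even, right even) -> guess


def elementary_step(even: List[int], odd: List[int], predict: Predictor, q: int) -> List[int]:
    """Replace every odd sample by its prediction error modulo q.

    The even band is left untouched; for fixed even neighbours the map
    odd -> (odd - prediction) mod q is a permutation of Z_q.
    """
    n = len(even)
    return [(odd[i] - predict(even[i], even[(i + 1) % n])) % q for i in range(len(odd))]


def inverse_step(even: List[int], residual: List[int], predict: Predictor, q: int) -> List[int]:
    """Undo elementary_step by adding the same prediction back."""
    n = len(even)
    return [(residual[i] + predict(even[i], even[(i + 1) % n])) % q for i in range(len(residual))]


def and_predictor(left: int, right: int) -> int:
    """A non-linear predictor over Z_2 (logical AND of the two neighbours)."""
    return left & right
```

Because only the odd band changes and the even band is available unchanged to
the inverse, no condition on `predict` is needed for invertibility.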
Abstract. We consider languages expressed by word equations in two
variables and give a complete characterization of their complexity
functions, that is, the functions that give the number of words of a given
length. Specifically, we prove that there are only five types of
complexities: constant, linear, exponential, and two in between constant and
linear. For the latter two, we give precise characterizations in terms of
the number of solutions of Diophantine equations of certain types. There
are several consequences of our study. First, we show that the linear upper
bound on the non-exponential complexities by Karhumäki et al., cf. [KMP],
is optimal. Second, we derive that both the set of all finite Sturmian words
and the set of all finite Standard words are expressible by word equations.
Third, we characterize the languages of non-exponential complexity which are
expressible by two-variable word equations as finite unions of several simple
parametric formulae and solutions of a two-variable word equation with a
finite graph. Fourth, we find optimal upper bounds on the solutions of
(solvable) two-variable word equations, namely, a linear bound for one
variable and a quadratic bound for the other. From this, we obtain an
$O(n^6)$ algorithm for testing the solvability of two-variable word equations.

Keywords: word equation, expressible language, complexity function, 
minimal solution, solvability 



1 Introduction 

Word equations constitute one of the basic parts of combinatorics on words. The
fundamental result in word equations is Makanin's algorithm, cf. [Ma], which
decides whether or not a word equation has a solution. The algorithm is one of
the most complicated ones existing in the literature. The structure of solutions
of word equations is not well understood; see [Hm], [Raz1], [Raz2]. New light
has been shed on this topic recently by [KMP], where the languages which are
defined by solutions of word equations are studied.

The structure of languages which are defined by equations with one variable is
very simple. The infinite languages which are defined by one-variable word
equations consist of a finite part and an infinite part which is of the form
$A^n A'$ for $A'$ a prefix of $A$. The structure of the finite part is not
completely known [?].
Our analysis deals with languages which are defined by two-variable word
equations. We prove that the complexity of those languages, which is measured
by the number of words of a given length, belongs to one of five classes:
constant, $\mathcal{D}_1$-type, $\mathcal{D}_2$-type, linear, and exponential.
The complexities $\mathcal{D}_1$-type and $\mathcal{D}_2$-type lie in between
constant and linear, and they are related to the number of solutions of certain
Diophantine equations. As a side effect of our considerations we prove that the
linear upper bound given in [KMP] for languages which do not contain a pattern
language is optimal. An interesting related result is that the sets of Sturmian
and Standard words are expressible by simple word equations. As another
consequence, we characterize the languages of non-exponential complexity which
are expressible by two-variable word equations as finite unions of several
simple parametric formulae and solutions of simple two-variable word equations.

Based on our analysis, we find optimal upper bounds on the solutions of
(solvable) two-variable word equations, namely, a linear bound for one variable
and a quadratic bound for the other. From this, we obtain an $O(n^6)$ algorithm
for testing the solvability of two-variable word equations. We recall that the
only polynomial-time algorithm known for this problem is the one given by
Charatonik and Pacholski [ChPa]. Its complexity, as computed in [ChPa], is
$O(n^{100})$. It should be added that they did not take very much care of the
complexity; they mainly intended to prove that the problem can be solved in
polynomial time.

Due to space limitations we omit all proofs, in particular those of several
lemmas which are used to prove our main theorem, Theorem 3.



2 Expressible Languages 

In this section we give the basic definitions we need later on and recall
some previous results. For an alphabet $\Sigma$, we denote by
$\operatorname{card}(\Sigma)$ the number of elements of $\Sigma$; $\Sigma^*$ is
the set of words over $\Sigma$, with 1 the empty word. For $w \in \Sigma^*$,
$|w|$ is the length of $w$; for $a \in \Sigma$, $|w|_a$ is the number of
occurrences of $a$ in $w$. By $\rho(w)$ we denote the primitive root of $w$. If
$w = uv$, then we denote $u^{-1}w = v$ and $wv^{-1} = u$. For any notions and
results of combinatorics on words, we refer to [ChKa].
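
As a small illustration of the notation just introduced, the following sketch
computes the primitive root of a word and the two quotients. The function
names are hypothetical and chosen only for this example; words are represented
as Python strings.

```python
def primitive_root(w: str) -> str:
    """Shortest word r such that w = r^k for some k >= 1 (the empty word maps to itself)."""
    n = len(w)
    for d in range(1, n + 1):
        if n % d == 0 and w[:d] * (n // d) == w:
            return w[:d]
    return w


def left_quotient(u: str, w: str) -> str:
    """Return u^{-1} w, defined only when u is a prefix of w."""
    if not w.startswith(u):
        raise ValueError("u is not a prefix of w")
    return w[len(u):]


def right_quotient(w: str, v: str) -> str:
    """Return w v^{-1}, defined only when v is a suffix of w."""
    if not w.endswith(v):
        raise ValueError("v is not a suffix of w")
    return w[:len(w) - len(v)]
```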

Consider two disjoint alphabets, of constants, $\Sigma$, and of variables,
$\Xi$. A word equation $e$ is a pair of words
$\varphi, \psi \in (\Sigma \cup \Xi)^*$, denoted $e : \varphi = \psi$. The size
of $e$, denoted $|e|$, is the sum of the lengths of $\varphi$ and $\psi$. The
equation $e$ is said to be reduced if $\varphi$ and $\psi$ start with different
letters and end with different letters, as words over $\Sigma \cup \Xi$.
Throughout the paper, all equations we consider are assumed to be reduced.

A solution of $e$ is a morphism $h : (\Sigma \cup \Xi)^* \to \Sigma^*$ such
that $h(a) = a$, for any $a \in \Sigma$, and $h(\varphi) = h(\psi)$. The set of
solutions of $e$ is denoted by $\operatorname{Sol}(e)$.

Notice that a solution can be given also as an ordered tuple of words, each
component of the tuple corresponding to a variable of the equation. Therefore,
we may take, for a variable $X \in \Xi$, the $X$-component of all solutions of
$e$, that is,

$$L_X(e) = \{x \in \Sigma^* \mid \text{there is a solution } h \text{ of } e \text{ such that } h(X) = x\}.$$

The set $L_X(e)$ is called the language expressed by $X$ in $e$. A language
$L \subseteq \Sigma^*$ is expressible if there is a word equation $e$ and a
variable $X$ such that $L = L_X(e)$. Notice that, if $X$ does not appear in
$e$, then $L_X(e) = \Sigma^*$ as soon as $\operatorname{Sol}(e) \neq \emptyset$.
Also, if $\operatorname{card}(\Sigma) = 1$, that is, there is only one constant
letter, then all expressible languages are trivially regular, as we work here
with numbers. Therefore, we shall always assume that
$\operatorname{card}(\Sigma) \geq 2$. The complexity function of a language
$L \subseteq \Sigma^*$ is the natural function $\#_L : \mathbb{N} \to \mathbb{N}$
defined by $\#_L(n) = \operatorname{card}\{w \in L \mid |w| = n\}$.

Example 1. Consider the equation $e : XX = Y$. The complexity of its solutions
with respect to $Y$ is

$$\#_{L_Y(e)}(n) = \begin{cases} \operatorname{card}(\Sigma)^{n/2}, & \text{if } n \text{ is even}, \\ 0, & \text{if } n \text{ is odd}. \end{cases}$$

Since the function $\#_L$ can be very irregular, as can be seen from the above
example, we use in our considerations a function which is defined by
$\overline{\#}_L(n) = \max_{1 \leq i \leq n} \#_L(i)$. We say that a function
$f$ is constant if $f(n) = \Theta(1)$, is linear if $f(n) = \Theta(n)$, and is
exponential if $f(n) = 2^{\Theta(n)}$.
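
Example 1 and the smoothed complexity function can be checked by brute force
on a small alphabet. The sketch below is only an illustration; the function
names are hypothetical, and the enumeration is exponential in n, so it is
meant for very small lengths.

```python
from itertools import product
from typing import List, Sequence


def count_squares(alphabet: Sequence[str], n: int) -> int:
    """Number of words of length n in L_Y(e) for e : XX = Y, i.e. squares of length n."""
    if n % 2 == 1:
        return 0
    half = n // 2
    return sum(1 for w in product(alphabet, repeat=n) if w[:half] == w[half:])


def smoothed(counts: List[int]) -> List[int]:
    """Running maximum: entry n-1 holds max(counts[0..n-1]), matching max over 1 <= i <= n."""
    result, best = [], 0
    for c in counts:
        best = max(best, c)
        result.append(best)
    return result
```

For the binary alphabet the raw counts alternate between 0 and a power of two,
while the smoothed function is monotone, which is why the classification is
stated for the smoothed version.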

We make the following conventions concerning notations:

- $a, b, \ldots \in \Sigma$ are constant letters,

- $A, B, \ldots \in \Sigma^*$ are (fixed) constant words,

- $X, Y, \ldots \in \Xi$ are variables,

- $x, y, \ldots \in \Sigma^*$ may denote some arbitrary constant words but may
also stand for images of variables by some morphisms from
$(\Sigma \cup \Xi)^*$ to $\Sigma^*$, that is, $x = h(X)$, $y = h(Y)$, etc.,

- $\varphi, \psi, \ldots \in (\Sigma \cup \Xi)^*$ are mixed words, which may
(but need not) contain both constants and variables.

We shall use also the following notation (due to Hmelevskii, cf. [Hm]): for
$\alpha_i \in (\Sigma \cup \Xi)^*$, $1 \leq i \leq n$, we denote
$[\alpha_i]_{i=1}^{n} = \alpha_1 \alpha_2 \cdots \alpha_n$.

Example 2. For a fixed word $A \in \Sigma^*$, the language
$L_1 = \{A^n \mid n \geq 0\}$ is expressed by the variable $Y$ in the
two-variable word equation $e_1 : XAY = AXX^t$, where $t$ is such that
$A = \rho(A)^t$. Then $\overline{\#}_{L_1}$ is constant.

We shall prove that the following result from [KMP] gives an optimal bound
on the non-exponential complexity of languages expressible by two-variable word
equations.

Theorem 1. (Karhumäki et al. [KMP]) Any non-exponential complexity function
of languages expressible by two-variable word equations is at most linear.




Fig. 1. The graph associated with a word equation (cases (i), (ii), and (iii))



We shall need also the graph associated with an equation
$e : \varphi = \psi$, see [?]. It is constructed by applying exhaustively the
so-called Levi's lemma, which states that if $uv = wt$, for some words
$u, v, w, t$, then either $u$ is a proper prefix of $w$, or $u = w$, or $w$ is
a proper prefix of $u$. The vertices of the graph are different equations
(including $e$) and the directed edges are put as follows. We start with $e$
and draw the graph by considering iteratively the following three cases,
depicted in Fig. 1: (i) both sides of $e$ start with variables (which are
different since the equation is assumed to be reduced), (ii) one side starts
with a constant and the other starts with a variable, and (iii) the two sides
start with constants which are different. Clearly, in the last case the
equation has no solution, which is marked by an error node. In Fig. 1 we denote
by $\varphi(\alpha|\beta)$ the word obtained from $\varphi$ by replacing all
occurrences of $\beta$ by $\alpha$.

Thus, we start by processing e and then process all unprocessed vertices. 
When we find an equation already obtained, we do not create a new vertex but 
direct the corresponding edge to the old one. 

We notice that the graph associated with a word equation may be infinite
but, if it is finite, then all solutions of the equation are obtained by
starting from a vertex with no outgoing edges that is different from err and
going in the opposite direction of the edges to the root; at the same time, the
corresponding operations on the values of the variables are performed.
We recall that Euler's totient function $\phi : \mathbb{N} \to \mathbb{N}$ is
defined by

$$\phi(n) = \operatorname{card}\{k \mid 1 \leq k \leq n,\ k \text{ is coprime with } n\} = n\left(1 - \frac{1}{p_1}\right)\left(1 - \frac{1}{p_2}\right) \cdots \left(1 - \frac{1}{p_k}\right),$$

where $p_1, p_2, \ldots, p_k$ are all the distinct prime factors of $n$.
Clearly, the function $\overline{\phi}(n) = \max_{1 \leq i \leq n} \phi(i)$ is
linear.
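
The two descriptions of the totient function, and its smoothed version, can be
compared numerically. This is a minimal sketch with hypothetical function
names; it uses trial division and is meant for small arguments only.

```python
from math import gcd


def phi_count(n: int) -> int:
    """Totient by definition: how many k in 1..n are coprime with n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def phi_product(n: int) -> int:
    """Totient via the product over distinct prime factors, in integer arithmetic."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p      # multiply by (1 - 1/p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:                          # one prime factor larger than sqrt(n) remains
        result -= result // m
    return result


def phi_bar(n: int) -> int:
    """Smoothed totient: maximum of phi(i) over 1 <= i <= n."""
    return max(phi_count(i) for i in range(1, n + 1))
```

Since phi(p) = p - 1 for every prime p, the smoothed function stays within a
constant factor of n, which is the linearity claimed above.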

3 The Equation XbaY = YabX 

The starting point of our analysis is the equation

$$e_0 : XbaY = YabX$$

which we study in this section. We show first that there is a very close
connection between solutions of $e_0$ and the family of Standard words which we
define below. Using then some strong properties of the Standard words, we prove
that both functions $\overline{\#}_{L_X(e_0)}$ and $\overline{\#}_{L_Y(e_0)}$
are linear.

Let us consider the set of solutions of our equation $e_0$. For this we draw
its associated graph in Fig. 2.




Fig. 2. The graph of $e_0$ (leaf labels: $Y \in a^*$, $X \in b^*$)

Consider the following two mappings

$$\alpha_1, \alpha_2 : \{a,b\}^* \times \{a,b\}^* \to \{a,b\}^* \times \{a,b\}^*$$

defined by

$$\alpha_1(u,v) = (u, ubav), \qquad \alpha_2(u,v) = (vabu, v),$$

for any $u, v \in \{a,b\}^*$. Using these two mappings, we give the following
result which characterizes the set of solutions of $e_0$.

Lemma 1. The solutions of $e_0$ are precisely the pairs of words obtained by:

(i) starting with a pair $(u,v)$ of words in the set

$$\{(a^{n+1}, a^n) \mid n \geq 0\} \cup \{(b^n, b^{n+1}) \mid n \geq 0\},$$

(ii) applying to $(u,v)$ a finite (possibly empty) sequence
$\alpha_{i_1}, \alpha_{i_2}, \ldots, \alpha_{i_k}$, for some $k \geq 0$,
$1 \leq i_j \leq 2$, for any $1 \leq j \leq k$.

We define next the Standard words. The set $\mathcal{R}$ of Standard pairs (as
defined by Rauzy, cf. [Ra]) is the minimal set included in
$\{a,b\}^* \times \{a,b\}^*$ such that:

(i) $(a, b) \in \mathcal{R}$ and

(ii) $\mathcal{R}$ is closed under the two mappings

$$\beta_1, \beta_2 : \{a,b\}^* \times \{a,b\}^* \to \{a,b\}^* \times \{a,b\}^*$$

defined by $\beta_1(u,v) = (u, uv)$, $\beta_2(u,v) = (vu, v)$, for any
$u, v \in \{a,b\}^*$. The set $S$ of Standard words is defined by

$$S = \{u \in \{a,b\}^* \mid \text{there is } v \in \{a,b\}^* \text{ such that either } (u,v) \in \mathcal{R} \text{ or } (v,u) \in \mathcal{R}\}.$$

We shall use the following strong properties of Standard words, proved by de
Luca and Mignosi, cf. [dLMi].

Lemma 2. (de Luca, Mignosi [dLMi]) The set of Standard words satisfies the
formula $S = \{a, b\} \cup \Pi\{ab, ba\}$ where

$$\Pi = \{w \in \{a,b\}^* \mid w \text{ has two periods } p, q \text{ which are coprime and } |w| = p + q - 2\}.$$

Lemma 3. (de Luca, Mignosi [dLMi]) $\#_\Pi(n) = \phi(n + 2)$, for any
$n \geq 0$, where $\phi$ is Euler's totient function.

We now establish a connection between the solutions of $e_0$ and the set of
Standard pairs $\mathcal{R}$.

Lemma 4. $\operatorname{Sol}(e_0) = \{(u,v) \mid (uba, vab) \in \mathcal{R}\}$.

Theorem 2. 1. $\#_{L_X(e_0)}(n) = \#_{L_Y(e_0)}(n) = \phi(n + 2)$, for any
$n \geq 1$.

2. $L_X(e_0) = L_Y(e_0) = \Pi$.
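
The description of the solutions in Lemma 1 is easy to experiment with. The
sketch below generates the pairs obtained from the starting pairs by the two
mappings, up to a length bound, checks that each generated pair satisfies
XbaY = YabX, and tabulates the X-components by length so that they can be
compared with the count of Lemma 3. The function names and the length-bounded
search are choices made for this illustration, not part of the paper.

```python
from math import gcd
from typing import List, Set, Tuple


def is_solution(u: str, v: str) -> bool:
    """Does (X, Y) = (u, v) satisfy XbaY = YabX?"""
    return u + "ba" + v == v + "ab" + u


def generated_pairs(bound: int) -> Set[Tuple[str, str]]:
    """Pairs produced as in Lemma 1 with both components of length <= bound."""
    stack = [("a" * (n + 1), "a" * n) for n in range(bound)]
    stack += [("b" * n, "b" * (n + 1)) for n in range(bound)]
    seen: Set[Tuple[str, str]] = set()
    while stack:
        u, v = stack.pop()
        if (u, v) in seen or len(u) > bound or len(v) > bound:
            continue
        seen.add((u, v))
        stack.append((u, u + "ba" + v))   # alpha_1
        stack.append((v + "ab" + u, v))   # alpha_2
    return seen


def x_counts(max_len: int) -> List[int]:
    """Number of distinct X-components of each length 0..max_len.

    The search bound is max_len + 1 so that starting pairs (b^n, b^(n+1))
    with n = max_len are not cut off.
    """
    words = {u for u, _ in generated_pairs(max_len + 1)}
    return [sum(1 for w in words if len(w) == n) for n in range(max_len + 1)]


def phi(n: int) -> int:
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)
```

Both mappings only lengthen words, so every pair with short components is
reached through pairs with short components, and the bounded search loses
nothing below the bound.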
As a corollary of Theorem 2 we obtain that the upper bound of Karhumäki
et al. in Theorem 1 is optimal.

Corollary 1. The linear upper bound for the non-exponential complexities of
languages expressible by two-variable word equations is optimal.

Furthermore, using the above considerations, we prove that both sets, of
Standard and of Sturmian (finite) words, are expressible by word equations.

Example 3. Standard words. The set of Standard words $S$ is expressed by the
variable $Z$ in the following system:

$$\begin{cases} XbaY = YabX \\ Z = Xba \text{ or } Z = Yab \text{ or } Z = a \text{ or } Z = b \end{cases}$$

First, by Lemma 4 it is clear that the $Z$-components of all solutions of the
system above are precisely the Standard words. Second, from the above system
we can derive a single equation, as is well known; see, e.g., [KMP]. Hence, the
set of Standard words is expressible.

Example 4. Sturmian words. There are many definitions of the finite Sturmian
words (see, e.g., [?] and references therein). We use here only the fact that
the set $St$ of finite Sturmian words is the set of factors of $\Pi$, cf.
[dLMi]. Therefore, the set $St$ is expressed by the variable $Z$ in the system

$$\begin{cases} XbaY = YabX \\ X = WZT \end{cases}$$

Again, this can be expressed using a single equation.

4 Special Equations 

We study in this section equations in two variables such that the leftmost
variable occurring on each side of the equation is the same. We show that in
this case there are three possible types of complexity for the language
expressed by the other component: constant, exponential, and
$\mathcal{D}_1$-type. The last type lies in between constant and linear and is
defined in terms of the number of solutions of certain Diophantine equations.

Before giving the definition of the $\mathcal{D}_1$-type, we give an example
showing how it arises naturally.

Example 5. Consider the equation

$$e : aXXbY = XaYbX.$$

Clearly, the set of solutions of $e$ is
$\operatorname{Sol}(e) = \{(a^n, (a^n b)^m a^n) \mid n, m \geq 0\}$. Here
$\overline{\#}_{L_X(e)}$ is constant but $\overline{\#}_{L_Y(e)}$ is not.
Indeed, for any $p \geq 0$, $\#_{L_Y(e)}(p)$ is the number of solutions of the
Diophantine equation in unknowns $n$ and $m$, $(n + 1)m + n = p$.
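
The correspondence in Example 5 between words of length p in the language of Y
and solutions of (n + 1)m + n = p can be checked directly. The sketch below is
an illustration with hypothetical function names; since
(n + 1)m + n = p is equivalent to (n + 1)(m + 1) = p + 1, the number of
solutions is the number of divisors of p + 1, which is what the last helper
computes for comparison.

```python
from typing import List, Tuple


def diophantine_solutions(p: int) -> List[Tuple[int, int]]:
    """All (n, m) with n, m >= 0 and (n + 1) * m + n == p."""
    return [(n, m) for n in range(p + 1) for m in range(p + 1) if (n + 1) * m + n == p]


def y_word(n: int, m: int) -> str:
    """The Y-component (a^n b)^m a^n of the solution with X = a^n."""
    return ("a" * n + "b") * m + "a" * n


def satisfies(x: str, y: str) -> bool:
    """Does (X, Y) = (x, y) satisfy aXXbY = XaYbX?"""
    return "a" + x + x + "b" + y == x + "a" + y + "b" + x


def divisor_count(k: int) -> int:
    return sum(1 for d in range(1, k + 1) if k % d == 0)
```

The divisor function is irregular but bounded on average by a logarithm, which
gives an intuition for why this type sits strictly between constant and linear.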
We now define the $\mathcal{D}_1$-type precisely. A function
$f : \mathbb{N} \to \mathbb{N}$ is of divisor-type if there are some
non-negative integers $c_i$, $1 \leq i \leq 4$, such that $c_1 \geq c_3$,
$c_2 \geq c_4$ and $f(k)$ is the number of solutions of the Diophantine
equation in unknowns $n$ and $m$, $(c_1 n + c_2)m + c_3 n + c_4 = k$. A
function $f$ is $\mathcal{D}_1$-type if there is a divisor-type function $g$
such that $f = \Theta(\overline{g})$ where
$\overline{g}(n) = \max_{1 \leq i \leq n} g(i)$.

Lemma 5. If $e : \varphi = \psi$ is an equation over $\Xi = \{X, Y\}$ such that
the first variable appearing in each of $\varphi$ and $\psi$ is $X$, then
$\overline{\#}_{L_X(e)}$ is constant and $\overline{\#}_{L_Y(e)}$ is either
constant, or $\mathcal{D}_1$-type, or else exponential.

We now give some examples in order to see that all situations in Lemma 5
are indeed possible.

Example 6. (i) Consider first the equation $e_1 : aXaY = XaYa$. Then, clearly,
both $\overline{\#}_{L_X(e_1)}$ and $\overline{\#}_{L_Y(e_1)}$ are constant.

(ii) For the equation in Example 5 we have that $\overline{\#}_{L_Y(e)}$ is
$\mathcal{D}_1$-type.

(iii) Our last equation is

$$e_2 : aXYXa = XaYaX.$$

Then, clearly,
$\operatorname{Sol}(e_2) = \{(a^n, w) \mid n \geq 0, w \in \Sigma^*\}$, hence
$\overline{\#}_{L_Y(e_2)}$ is exponential as soon as
$\operatorname{card}(\Sigma) \geq 2$.

5 The General Form 

We start by defining $\mathcal{D}_2$-type complexity functions. We say that a
natural function $f : \mathbb{N} \to \mathbb{N}$ is of divisor2-type if there
are some integers $c_i$, $1 \leq i \leq 8$, such that $f(k)$ is the number of
solutions of the Diophantine equation in unknowns $n$, $m$, and $p$

$$((c_1 n + c_2)m + c_3 n + c_4)p + (c_5 n + c_6)m + c_7 n + c_8 = k.$$

Note here that each divisor-type function is also a divisor2-type function. We
say that a function $f$ is of $\mathcal{D}_2$-type complexity if there is a
divisor2-type function $g$ such that $f = \Theta(\overline{g})$ where
$\overline{g}(n) = \max_{1 \leq i \leq n} g(i)$.

We give next an example in which the $\mathcal{D}_2$-type complexity is reached.

Example 7. Consider the equation

$$e : XabcXcbabcY = YcbaXcbabcX,$$

which we solve completely in what follows. Consider a solution
$(x,y) \in \operatorname{Sol}(e)$. Then $xabcxcbabc$ and $cbaxcbabcx$ are
conjugated by $y$, that is,
$xabcxcbabc = \operatorname{cycle}^t(cbaxcbabcx)$, for some
$0 \leq t \leq 2|x| + 7$. It is not difficult to see that the only
possibilities for $t$ are (i) $t = 1$, (ii) $t = |x| + 4$, and
(iii) $|x| + 9 \leq t \leq 2|x| + 7$. (The cases
$t \in \{0, 2, 3, |x| + 3\}$ and $|x| + 5 \leq t \leq |x| + 8$ are immediately
ruled out; the cases $4 \leq t \leq |x| + 2$ are ruled out using reasoning
from the proof of Theorem 3.)
In case (i) we have $x = bab$, $y = (ba(babc)^3)^n ba(babc)^2 bab$, $n \geq 0$,
and in case (ii) we obtain $x = b$, $y = (ba(bc)^2 babc)^n babcb$, $n \geq 0$.
Thus, in both cases, we have a constant contribution to either of
$\overline{\#}_{L_X(e)}$, $\overline{\#}_{L_Y(e)}$.

The interesting case is (iii). Applying the reasoning in the proof of
Theorem 3 we obtain

$$x = (bc)^n b\,(a(bc)^{n+1}b)^m a\,(bc)^n b, \qquad y = (xabcxcbabc)^p\, uv\,(cba)^{-1} \qquad (1)$$

for any $n, m, p \geq 0$, where $u = (bc)^n b$ and $v = (a(bc)^{n+1}b)^m a$.
Hence, for any $k \geq 0$, $\#_{L_Y(e)}(k)$ differs by at most 2 from the number
of solutions of the Diophantine equation (in unknowns $n$, $m$, and $p$)

$$((4n + 8)m + 8n + 14)p + (2n + 4)m + 2n - 1 = k.$$

We have used also the fact that, if we denote $y$ in (1) by $y_{n,m,p}$, then
$(n_1, m_1, p_1) \neq (n_2, m_2, p_2)$ implies
$y_{n_1,m_1,p_1} \neq y_{n_2,m_2,p_2}$. Consequently, $\overline{\#}_{L_Y(e)}$
is of $\mathcal{D}_2$-type.

We now study two-variable word equations of the general form. We show that
one cannot obtain as complexities of their expressed languages anything but the
five types we have identified so far, namely constant, $\mathcal{D}_1$-type,
$\mathcal{D}_2$-type, linear, and exponential.

Theorem 3. Let $e$ be an equation with two variables $X$, $Y$. Then

$$\overline{\#}_{L_X(e)}, \overline{\#}_{L_Y(e)} \in \{\text{constant}, \mathcal{D}_1\text{-type}, \mathcal{D}_2\text{-type}, \text{linear}, \text{exponential}\}.$$

As another consequence of our study, we can give the general forms of the 
languages expressible by two-variable word equations. 

First, we need the notion of pattern language, from [?], cf. also [?]. A
pattern is a word over the alphabet $\Sigma \cup \Xi$. The pattern language
generated by a pattern $\alpha$, denoted $L(\alpha)$, is the set of all morphic
images of $\alpha$ under morphisms $h : (\Sigma \cup \Xi)^* \to \Sigma^*$
satisfying $h(a) = a$, for any $a \in \Sigma$.

By Theorem 13 in [KMP], we know that, for any language $L$ which is
expressible by a two-variable word equation, if $\overline{\#}_L$ is
exponential, then there exists a pattern $\alpha$ containing occurrences of one
variable only such that $L(\alpha) \subseteq L$.

We have then the following theorem which characterizes the languages
expressible by two-variable word equations.

Theorem 4. For any language $L$ which is expressible by a two-variable word
equation, we have

(i) if $\overline{\#}_L$ is exponential, then $L$ contains a pattern language,

(ii) if $\overline{\#}_L$ is not exponential, then $L$ is a union of

(a) a finite language,

(b) finitely many parametric formulae of the forms

- A^B, 

- {A^B)'^A^C, 

- ([A"H,]ti)-pref([A"H,]ti), 

- and 

(c) solutions of an equation XAY = YBX; these solutions can be expressed 
as compositions of finite number of substitutions which can be computed on the 
basis of the graph for XAY = YBX . 
6 Minimal Solutions and Solvability

We consider here the lengths of solutions for two-variable word equations.

Example 8. Consider the equation

$$e : aXa^n bX^n = XaXbY.$$

The equation $e$ has a unique solution, which is $(x, y) = (a^n, a^{n^2})$. As
$|e| = 2n + 8$, we have that $|x| = \Theta(|e|)$ and $|y| = \Theta(|e|^2)$.

Using our analysis of two-variable word equations one can prove that the
bounds for the lengths of words in solutions in Example 8 are optimal.

Theorem 5. If $e$ is a solvable two-variable word equation over
$\Xi = \{X, Y\}$, then $e$ has a solution $(x, y)$ such that $|x| \leq 2|e|$,
$|y| \leq 2|e|^2$.

Given a two-variable word equation $e : \varphi = \psi$ and two non-negative
numbers $l_X, l_Y$, it is clear that we can check in time
$|e| + |\varphi\psi|_X (l_X - 1) + |\varphi\psi|_Y (l_Y - 1)$ whether $e$ has a
solution $(x, y)$ for some $x, y$ with $|x| = l_X$, $|y| = l_Y$. Therefore,
we get immediately from Theorem 5 the following result.

Theorem 6. The solvability of two-variable word equations can be tested in time
$O(n^6)$.
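
The length-bounded check behind Theorem 6 can be sketched as follows. This is
only one possible way to carry out the check, not necessarily the procedure
the authors have in mind: once the lengths of x and y are fixed, both sides
expand to sequences of letter positions, equal positions are merged with a
union-find structure, and the equation is solvable for these lengths exactly
when no class contains two different constants. The conventions here are
assumptions of the example: variables are the uppercase letters X and Y, all
other symbols are constants, and the search in `solvable` uses the quadratic
bound of Theorem 5 for both variables so that it does not matter which one
receives the linear bound.

```python
from typing import Dict, List, Tuple

Cell = Tuple[str, object]


def fits(lhs: str, rhs: str, lx: int, ly: int) -> bool:
    """Is there a solution of lhs = rhs with |x| = lx and |y| = ly?"""
    lengths = {"X": lx, "Y": ly}

    def expand(side: str) -> List[Cell]:
        cells: List[Cell] = []
        for symbol in side:
            if symbol in lengths:
                cells.extend((symbol, i) for i in range(lengths[symbol]))
            else:
                cells.append(("const", symbol))
        return cells

    left, right = expand(lhs), expand(rhs)
    if len(left) != len(right):
        return False

    parent: Dict[Cell, Cell] = {}

    def find(cell: Cell) -> Cell:
        parent.setdefault(cell, cell)
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]   # path halving
            cell = parent[cell]
        return cell

    for p, q in zip(left, right):
        parent[find(p)] = find(q)                 # positions must carry equal letters

    letter_of: Dict[Cell, str] = {}
    for cell in list(parent):
        if cell[0] == "const":
            root = find(cell)
            if letter_of.setdefault(root, cell[1]) != cell[1]:
                return False                      # two different constants identified
    return True


def solvable(lhs: str, rhs: str) -> bool:
    """Search all length pairs up to the (quadratic) bound of Theorem 5."""
    n = len(lhs) + len(rhs)
    bound = 2 * n * n
    return any(fits(lhs, rhs, lx, ly)
               for lx in range(bound + 1) for ly in range(bound + 1))
```

Classes that contain no constant may be filled with any letter, so no further
condition is needed for them.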

Another consequence of Theorem concerns the complexity of languages 
expressible by three-variable word equations. The following result can be
proved as Theorem 13 in [KMP].

Theorem 7. Let L be a language expressible by a three-variable word equation. 
Then either there is a one-variable pattern a such that L(a) C L or #i(n) = 
0{n^). 
Abstract. We compare classical and quantum query complexities of total
Boolean functions. It is known that for worst-case complexity, the gap
between quantum and classical can be at most polynomial [?]. We show
that for average-case complexity under the uniform distribution, quantum
algorithms can be exponentially faster than classical algorithms.
Under non-uniform distributions the gap can even be super-exponential.
We also prove some general bounds for average-case complexity and show
that the average-case quantum complexity of MAJORITY under the uniform
distribution is nearly quadratically better than the classical complexity.



1 Introduction 

The field of quantum computation studies the power of computers based on
quantum mechanical principles. So far, most quantum algorithms (and all
physically implemented ones) have operated in the so-called black-box setting.
Examples are [?]; even period-finding, which is the core of Shor's factoring
algorithm [?], can be viewed as a black-box problem. Here the input of the
function $f$ that we want to compute can only be accessed by means of queries
to a "black-box". This returns the $i$th bit of the input when queried on $i$.
The complexity of computing $f$ is measured by the required number of queries.
In this setting we want quantum algorithms that use significantly fewer queries
than the best classical algorithms.

We restrict attention to computing total Boolean functions $f$ on $N$
variables. The query complexity of $f$ depends on the kind of errors one
allows. For example, we can distinguish between exact computation, zero-error
computation (a.k.a. Las Vegas), and bounded-error computation (Monte Carlo). In
each of these models, worst-case complexity is usually considered: the
complexity is the number of queries required for the "hardest" input. Let
$D(f)$, $R(f)$ and $Q(f)$ denote the worst-case query complexity of computing
$f$ for classical deterministic algorithms, classical randomized bounded-error
algorithms, and quantum bounded-error algorithms, respectively. Clearly
$Q(f) \leq R(f) \leq D(f)$.

* Part of this work was done when visiting Microsoft Research. 
The main quantum success here is Grover's algorithm [?]. It can compute the
OR-function with bounded error using $O(\sqrt{N})$ queries (this is optimal
[?]). Thus $Q(\mathrm{OR}) \in \Theta(\sqrt{N})$, whereas
$D(\mathrm{OR}) = N$ and $R(\mathrm{OR}) \in \Theta(N)$. This is the biggest
gap known between quantum and classical worst-case complexities for total
functions. (In contrast, for partial Boolean functions the gap can be much
bigger [?].) A recent result is that the gap between $D(f)$ and $Q(f)$ is at
most polynomial for every total $f$: $D(f) \in O(Q(f)^6)$ [?]. This is similar
to the best-known relation between classical deterministic and randomized
algorithms: $D(f) \in O(R(f)^3)$ [?].

Given some probability distribution $\mu$ on the set of inputs $\{0,1\}^N$, one
may also consider average-case complexity instead of worst-case complexity.
Average-case complexity concerns the expected number of queries needed when the
input is distributed according to $\mu$. If the hard inputs receive little
$\mu$-probability, then average-case complexity can be significantly smaller
than worst-case complexity. Let $D^\mu(f)$, $R^\mu(f)$, and $Q^\mu(f)$ denote
the average-case analogues of $D(f)$, $R(f)$, and $Q(f)$, respectively. Again
$Q^\mu(f) \leq R^\mu(f) \leq D^\mu(f)$. The objective of this paper is to
compare these measures and to investigate the possible gaps between them. Our
main results are:

- Under uniform $\mu$, $Q^\mu(f)$ and $R^\mu(f)$ can be super-exponentially
smaller than $D^\mu(f)$.

- Under uniform $\mu$, $Q^\mu(f)$ can be exponentially smaller than
$R^\mu(f)$. Thus the [?]-result for worst-case quantum complexity does not
carry over to the average-case setting.

- Under non-uniform $\mu$ the gap can be even larger: we give distributions
$\mu$ where $Q^\mu(\mathrm{OR})$ is constant, whereas $R^\mu(\mathrm{OR})$ is
almost $\sqrt{N}$. (Both this gap and the previous one still remain if we
require the quantum algorithm to work with zero-error instead of
bounded-error.)

- For every $f$ and $\mu$, $R^\mu(f)$ is lower bounded by the expected block
sensitivity $E_\mu[bs(f)]$ and $Q^\mu(f)$ is lower bounded by
$E_\mu[\sqrt{bs(f)}]$.

- For the MAJORITY-function under uniform $\mu$, we have
$Q^\mu(f) \in O(N^{1/2 + \varepsilon})$ for every $\varepsilon > 0$, and
$Q^\mu(f) \in \Omega(N^{1/2})$. In contrast, $R^\mu(f) \in \Omega(N)$.

- For the PARITY-function, the gap between $Q^\mu$ and $R^\mu$ can be
quadratic, but not more. Under uniform $\mu$, PARITY has
$Q^\mu(f) \in \Omega(N)$.

2 Definitions 

Let $f : \{0,1\}^N \to \{0,1\}$ be a Boolean function. It is symmetric if
$f(X)$ only depends on $|X|$, the Hamming weight (number of 1s) of $X$.
$\vec{0}$ denotes the input with weight 0. We will in particular consider the
following functions: $\mathrm{OR}(X) = 1$ iff $|X| \geq 1$;
$\mathrm{MAJ}(X) = 1$ iff $|X| > N/2$; $\mathrm{PARITY}(X) = 1$ iff $|X|$ is
odd. If $X \in \{0,1\}^N$ is an input and $S$ a set of (indices of) variables,
we use $X^S$ to denote the input obtained by flipping the values of the
$S$-variables in $X$. The block sensitivity $bs_X(f)$ of $f$ on input $X$ is
the maximal number $b$ for which there are $b$ disjoint sets of variables
$S_1, \ldots, S_b$ such that $f(X) \neq f(X^{S_i})$ for all $1 \leq i \leq b$.
The block sensitivity $bs(f)$ of $f$ is $\max_X bs_X(f)$.
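
For very small N the definition of block sensitivity can be evaluated by brute
force, which may help to fix the notation. The sketch below is an illustration
with hypothetical function names; it enumerates all sensitive blocks and
searches for the largest disjoint family, so its running time is exponential
in N.

```python
from itertools import combinations, product
from typing import Callable, FrozenSet, List, Tuple

Bits = Tuple[int, ...]


def flip(x: Bits, block) -> Bits:
    """The input X^S: x with the positions in block flipped."""
    return tuple(b ^ 1 if i in block else b for i, b in enumerate(x))


def bs_at(f: Callable[[Bits], int], x: Bits) -> int:
    """Block sensitivity of f on input x."""
    n = len(x)
    sensitive: List[FrozenSet[int]] = [
        frozenset(s)
        for r in range(1, n + 1)
        for s in combinations(range(n), r)
        if f(flip(x, s)) != f(x)
    ]

    def best(start: int, used: FrozenSet[int]) -> int:
        result = 0
        for j in range(start, len(sensitive)):
            if not (sensitive[j] & used):
                result = max(result, 1 + best(j + 1, used | sensitive[j]))
        return result

    return best(0, frozenset())


def bs(f: Callable[[Bits], int], n: int) -> int:
    """Block sensitivity of f: maximum of bs_at over all inputs of length n."""
    return max(bs_at(f, x) for x in product((0, 1), repeat=n))


def OR(x: Bits) -> int:
    return int(sum(x) >= 1)


def MAJ(x: Bits) -> int:
    return int(sum(x) > len(x) / 2)


def PARITY(x: Bits) -> int:
    return sum(x) % 2
```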
We focus on three kinds of algorithms for computing $f$: classical
deterministic, classical randomized bounded-error, and quantum bounded-error
algorithms. If $A$ is an algorithm (quantum or classical) and $b \in \{0,1\}$,
we use $\Pr[A(X) = b]$ to denote the probability that $A$ answers $b$ on input
$X$. We use $T_A(X)$ for the expected number of queries that $A$ uses on input
$X$.^1 Note that this only depends on $A$ and $X$, not on the input
distribution $\mu$. For deterministic $A$, $\Pr[A(X) = b] \in \{0,1\}$ and the
expected number of queries $T_A(X)$ is the same as the actual number of
queries.

Let $\mathcal{D}(f)$ denote the set of classical deterministic algorithms that
compute $f$. Let
$\mathcal{R}(f) = \{\text{classical } A \mid \forall X \in \{0,1\}^N : \Pr[A(X) = f(X)] \geq 2/3\}$
be the set of classical randomized algorithms that compute $f$ with bounded
error probability. Similarly let $\mathcal{Q}(f)$ be the set of quantum
algorithms that compute $f$ with bounded error. We define the following
worst-case complexities:



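$$D(f) = \min_{A \in \mathcal{D}(f)} \max_{X \in \{0,1\}^N} T_A(X)$$

$$R(f) = \min_{A \in \mathcal{R}(f)} \max_{X \in \{0,1\}^N} T_A(X)$$

$$Q(f) = \min_{A \in \mathcal{Q}(f)} \max_{X \in \{0,1\}^N} T_A(X)$$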



$D(f)$ is also known as the decision tree complexity of $f$ and $R(f)$ as the
bounded-error decision tree complexity of $f$. Since quantum generalizes
randomized and randomized generalizes deterministic computation, we have
$Q(f) \leq R(f) \leq D(f)$ for all $f$. The three worst-case complexities are
polynomially related: $D(f) \in O(R(f)^3)$ [?] and $D(f) \in O(Q(f)^6)$ [?]
for all total $f$.

Let $\mu : \{0,1\}^N \to [0,1]$ be a probability distribution. We define the
average-case complexity of an algorithm $A$ with respect to a distribution
$\mu$ as:

$$T_A^\mu = \sum_{X \in \{0,1\}^N} \mu(X)\, T_A(X).$$
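
To make the definition concrete, the following sketch evaluates the
average-case cost of one particular deterministic algorithm for OR, namely the
scan that queries the bits from left to right and stops at the first 1. The
algorithm and the function names are chosen for this illustration only; exact
rational arithmetic avoids rounding.

```python
from fractions import Fraction
from itertools import product
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def or_scan_queries(x: Bits) -> int:
    """T_A(X) for the left-to-right scan that stops at the first 1."""
    for i, bit in enumerate(x, start=1):
        if bit == 1:
            return i
    return len(x)


def uniform(n: int) -> Callable[[Bits], Fraction]:
    """The uniform distribution on {0,1}^n."""
    return lambda x: Fraction(1, 2 ** n)


def average_cost(cost: Callable[[Bits], int], mu: Callable[[Bits], Fraction], n: int) -> Fraction:
    """T_A^mu: the mu-weighted sum of the query counts over all inputs."""
    return sum((mu(x) * cost(x) for x in product((0, 1), repeat=n)), Fraction(0))


def worst_cost(cost: Callable[[Bits], int], n: int) -> int:
    return max(cost(x) for x in product((0, 1), repeat=n))
```

Already for this simple algorithm the average under the uniform distribution
stays below 2 for every n, while the worst case is n.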



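The average-case deterministic, randomized, and quantum complexities of $f$
with respect to $\mu$ are

$$D^\mu(f) = \min_{A \in \mathcal{D}(f)} T_A^\mu$$

$$R^\mu(f) = \min_{A \in \mathcal{R}(f)} T_A^\mu$$

$$Q^\mu(f) = \min_{A \in \mathcal{Q}(f)} T_A^\mu$$

Note that the algorithms still have to output the correct answer on all inputs,
even on $X$ that have $\mu(X) = 0$. Clearly
$Q^\mu(f) \leq R^\mu(f) \leq D^\mu(f)$ for all $\mu$ and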

^1 See [?] for definitions and references for the quantum circuit model. A
satisfactory formal definition of expected number of queries $T_A(X)$ for a
quantum algorithm $A$ is a hairy issue, involving the notion of a stopping
criterion. We will not give such a definition here, since in the bounded-error
case, expected and worst-case number of queries can be made the same up to a
small constant factor.
$f$. Our goal is to examine how large the gaps between these measures can be,
in particular for the uniform distribution $\mathrm{unif}(X) = 2^{-N}$.

The above treatment of average-case complexity is the standard one used
in average-case analysis of algorithms [?]. One counter-intuitive consequence
of these definitions, however, is that the average-case performance of
polynomially related algorithms can be superpolynomially apart (we will see
this happen in Section 3). This seemingly paradoxical effect makes these
definitions unsuitable for dealing with polynomial-time reducibilities and
average-case complexity classes, which is what led Levin to his alternative
definition of "polynomial time on average" [?].^2 Nevertheless, we feel the
above definitions are the appropriate ones for our query complexity setting:
they just are the average number of queries that one needs when the input is
drawn according to distribution $\mu$.



3 Super-Exponential Gap between $D^{unif}(f)$ and $Q^{unif}(f)$

Here we show that $D^{unif}(f)$ can be much larger than $R^{unif}(f)$ and
$Q^{unif}(f)$.

Theorem 1. Define $f$ on $N$ variables such that $f(X) = 1$ iff
$|X| \geq N/10$. Then $R^{unif}(f)$ and $Q^{unif}(f)$ are $O(1)$ and
$D^{unif}(f) \in \Omega(N)$.

Proof. Suppose we randomly sample $k$ bits of the input. Let $a = |X|/N$ denote
the fraction of 1s in the input and $\tilde{a}$ the fraction of 1s in the
sample. Standard Chernoff bounds imply that there is a constant $c > 0$ such
that

$$\Pr[\tilde{a} < 2/10 \mid a \geq 3/10] \leq 2^{-ck}.$$

Now consider the following randomized algorithm for $f$:

1. Let $i = 1$.

2. Sample $k_i = i/c$ bits. If the fraction $\tilde{a}_i$ of 1s is
$\geq 2/10$, output 1 and stop.

3. If $i < \log N$, increase $i$ by 1 and repeat step 2.

4. If $i \geq \log N$, count $|X|$ exactly using $N$ queries and output the
correct answer.
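
The four steps above can be sketched in code as follows. Several details are
assumptions of this illustration and not fixed by the text: the parameter
`bits_per_round` stands for 1/c (the Chernoff constant c is not specified),
sampling is done with replacement, log is taken to base 2 and rounded down,
and the function returns the number of queries alongside the answer so that
the cost can be inspected.

```python
import math
import random
from typing import Sequence, Tuple


def threshold_sampler(x: Sequence[int], bits_per_round: int = 20,
                      rng: random.Random = None) -> Tuple[int, int]:
    """Return (answer, queries) for f(X) = 1 iff |X| >= N/10.

    bits_per_round plays the role of 1/c; its default value is arbitrary.
    """
    rng = rng or random.Random()
    n = len(x)
    queries = 0
    rounds = int(math.log2(n)) if n > 1 else 0
    for i in range(1, rounds + 1):
        k = i * bits_per_round                      # step 2: sample k_i = i/c bits
        ones = sum(x[rng.randrange(n)] for _ in range(k))
        queries += k
        if 10 * ones >= 2 * k:                      # sampled fraction of 1s >= 2/10
            return 1, queries
    queries += n                                    # step 4: read the whole input
    return int(10 * sum(x) >= n), queries
```

On an input with many 1s the first small sample already triggers the early
exit with high probability, which is the source of the constant average cost;
the exact count in step 4 is what keeps the answer correct on inputs with few
1s.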

It is easily seen that this is a bounded-error algorithm for $f$. Let us bound
its average-case complexity under the uniform distribution.

If $a \geq 3/10$, the expected number of queries for step 2 is

$$\sum_{i=1}^{\log N} \Pr[\tilde{a}_1 < 2/10, \ldots, \tilde{a}_{i-1} < 2/10 \mid a \geq 3/10] \cdot \frac{i}{c} \leq \sum_{i=1}^{\log N} \Pr[\tilde{a}_{i-1} < 2/10 \mid a \geq 3/10] \cdot \frac{i}{c} \leq \sum_{i=1}^{\log N} 2^{-(i-1)} \cdot \frac{i}{c} \in O(1).$$

The probability that step 4 is needed (given $a \geq 3/10$) is at most
$2^{-c \log N / c} = 1/N$. This adds $\frac{1}{N} N = 1$ to the expected number
of queries.

^2 We thank Umesh Vazirani for drawing our attention to this.
The probability of $a < 3/10$ is $2^{-c'N}$ for some constant $c'$. This case
contributes at most $2^{-c'N}(N + (\log N)^2) \in o(1)$ to the expected number
of queries. Thus in total the algorithm uses $O(1)$ queries on average, hence
$R^{unif}(f) \in O(1)$.

It is easy to see that any deterministic classical algorithm for $f$ must make
at least $N/10$ queries on every input, hence $D^{unif}(f) \geq N/10$. □

Accordingly, we can have huge gaps between $D^{unif}(f)$ and $Q^{unif}(f)$.
However, this example tells us nothing about the gaps between quantum and
classical bounded-error algorithms. In the next section we exhibit an $f$ where
$Q^{unif}(f)$ is exponentially smaller than $R^{unif}(f)$.

4 Exponential Gap between $R^{unif}(f)$ and $Q^{unif}(f)$

4.1 The Function

We use the following modification of Simon's problem [?]:^3

Input: $X = (x_1, \ldots, x_{2^n})$, where each $x_i \in \{0,1\}^n$.

Output: $f(X) = 1$ iff there is a non-zero $k \in \{0,1\}^n$ such that
$x_{i \oplus k} = x_i$ for all $i$.

Here we treat $i \in \{0,1\}^n$ both as an $n$-bit string and as a number, and
$\oplus$ denotes bitwise XOR. Note that this function is total (unlike
Simon's). Formally, $f$ is not a Boolean function because the variables are
$\{0,1\}^n$-valued. However, we can replace every variable $x_i$ by $n$ Boolean
variables and then $f$ becomes a Boolean function of $N = n2^n$ variables. The
number of queries needed to compute the Boolean function is at least the number
of queries needed to compute the function with $\{0,1\}^n$-valued variables
(because we can simulate a query to the Boolean oracle with a query to the
$\{0,1\}^n$-valued oracle by just throwing away the rest of the information)
and at most $n$ times the number of queries to the $\{0,1\}^n$-valued oracle
(because one $\{0,1\}^n$-valued query can be simulated using $n$ Boolean
queries). As the numbers of queries are so closely related, it does not make a
big difference whether we use the $\{0,1\}^n$-valued oracle or the Boolean
oracle. For simplicity we count queries to the $\{0,1\}^n$-valued oracle.
The main result is the following exponential gap: 

Theorem 2. For $f$ as above, $Q^{unif}(f) \leq 22n + 1$ and $R^{unif}(f) \in \Omega(2^{n/2})$.



4.2 Quantum Upper Bound 

The quantum algorithm is similar to Simon’s. Start with the 2-register superposition
$\sum_{i \in \{0,1\}^n} |i\rangle|0\rangle$ (for convenience we ignore normalizing factors). Apply
the oracle once to obtain

$$\sum_{i \in \{0,1\}^n} |i\rangle|x_i\rangle.$$



® The recent preprint na proves a related but incomparable result about another 
modification of Simon’s problem. 
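Measuring the second register gives some $j$ and collapses the first register to

$$\sum_{i : x_i = j} |i\rangle.$$

Applying a Hadamard transform $H$ to each qubit of the first register gives

$$\sum_{i : x_i = j} \; \sum_{i' \in \{0,1\}^n} (-1)^{(i,i')} |i'\rangle. \qquad (1)$$

Here $(a,b)$ denotes inner product mod 2; if $(a,b) = 0$ we say $a$ and $b$ are orthogonal.

If $f(X) = 1$, then there is a non-zero $k$ such that $x_i = x_{i \oplus k}$ for all $i$. In
particular, $x_i = j$ iff $x_{i \oplus k} = j$. Then the final state (1) can be rewritten as

$$\sum_{i' \in \{0,1\}^n} \; \sum_{i : x_i = j} \frac{1}{2}\left((-1)^{(i,i')} + (-1)^{(i \oplus k, i')}\right) |i'\rangle = \sum_{i' \in \{0,1\}^n} \; \sum_{i : x_i = j} \frac{(-1)^{(i,i')}}{2}\left(1 + (-1)^{(k,i')}\right) |i'\rangle.$$

Notice that $|i'\rangle$ has non-zero amplitude only if $(k,i') = 0$. Hence if $f(X) = 1$,
then measuring the final state gives some $i'$ orthogonal to the unknown $k$.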
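For small $n$ the measurement statistics of one round can be checked classically. The following Python sketch is only an illustration and not part of the analysis: it computes the outcome distribution by brute force, which takes time exponential in $n$, unlike the quantum procedure. The function names and the test input `x[i] = min(i, i ^ k)` are assumptions made for this example.

```python
import random

def inner(a, b):
    """Inner product mod 2 of two n-bit strings given as integers."""
    return bin(a & b).count("1") % 2

def sample_round(x, n, rng):
    """Classically sample the outcome i' of one round (brute force)."""
    # Measuring the second register yields j = x_i for a uniformly random i.
    j = x[rng.randrange(2 ** n)]
    preimage = [i for i in range(2 ** n) if x[i] == j]
    # Unnormalized probability of i': square of sum_{i: x_i = j} (-1)^{(i,i')}.
    weights = [sum((-1) ** inner(i, ip) for i in preimage) ** 2
               for ip in range(2 ** n)]
    return rng.choices(range(2 ** n), weights=weights)[0]

# Hypothetical test input with hidden k: x_i = x_{i XOR k} by construction.
n, k = 4, 0b1010
x = [min(i, i ^ k) for i in range(2 ** n)]
rng = random.Random(0)
samples = [sample_round(x, n, rng) for _ in range(200)]
```

On this input every sampled $i'$ should be orthogonal to $k$, in line with the amplitude argument.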

To decide if $f(X) = 1$, we repeat the above process $m = 22n$ times. Let
$i_1, \ldots, i_m \in \{0,1\}^n$ be the results of the $m$ measurements. If $f(X) = 1$, there
must be a non-zero $k$ that is orthogonal to all $i_r$. Compute the subspace
$S \subseteq \{0,1\}^n$ that is generated by $i_1, \ldots, i_m$ (i.e. $S$ is the set of binary vectors obtained
by taking linear combinations of $i_1, \ldots, i_m$ over $GF(2)$). If $S = \{0,1\}^n$, then the
only $k$ that is orthogonal to all $i_r$ is $k = 0^n$, so then we know that $f(X) = 0$. If
$S \neq \{0,1\}^n$, we just query all $2^n$ values $x_{0 \ldots 0}, \ldots, x_{1 \ldots 1}$ and then compute $f(X)$.
This latter step is of course very expensive, but it is needed only rarely:

Lemma 1. Assume that $X = (x_{0 \ldots 0}, \ldots, x_{1 \ldots 1})$ is chosen uniformly at random
from $\{0,1\}^N$. Then, with probability at least $1 - 2^{-n}$, $f(X) = 0$ and the measured
$i_1, \ldots, i_m$ generate $\{0,1\}^n$.

Proof. It can be shown by a small modification of [Theorem 5.1, p.91] that
with probability at least $1 - 2^{-c2^n}$ ($c > 0$), there are at least $2^n/8$ values $j$ such
that $x_i = j$ for exactly one $i \in \{0,1\}^n$. We assume that this is the case.

If $i_1, \ldots, i_m$ generate a proper subspace of $\{0,1\}^n$, then there is a non-zero
$k \in \{0,1\}^n$ that is orthogonal to this subspace. We estimate the probability that
this happens. Consider some fixed non-zero vector $k \in \{0,1\}^n$. The probability
that $i_1$ and $k$ are orthogonal is at most $15/16$, as follows. With probability at least
1/8, the measurement of the second register gives $j$ such that $x_i = j$ for a
unique $i$. In this case, the measurement of the final superposition gives a
uniformly random $i'$. The probability that a uniformly random $i'$ has $(k,i') \neq 0$
is 1/2. Therefore, the probability that $(k,i_1) = 0$ is at most $1 - \frac{1}{8} \cdot \frac{1}{2} = \frac{15}{16}$.
The vectors $i_1, \ldots, i_m$ are chosen independently. Therefore, the probability
that $k$ is orthogonal to each of them is at most $(15/16)^m = (15/16)^{22n} < 2^{-2n}$. There are $2^n - 1$
possible non-zero $k$, so the probability that there is a $k$ which is orthogonal to
each of $i_1, \ldots, i_m$ is at most $(2^n - 1)2^{-2n} < 2^{-n}$. □
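The arithmetic behind the choice $m = 22n$ can be verified with exact rational arithmetic. The sketch below is a sanity check written for this purpose, with `failure_bound` and `rounds_per_bit` as hypothetical names: it confirms that $(15/16)^{22} < 1/4$, so the union bound over all non-zero $k$ stays below $2^{-n}$, and that 21 repetitions per bit would not give this particular estimate.

```python
from fractions import Fraction

def failure_bound(n, rounds_per_bit=22):
    """Union bound (2^n - 1) * (15/16)^(rounds_per_bit * n) from the proof."""
    per_vector = Fraction(15, 16) ** (rounds_per_bit * n)
    return (2 ** n - 1) * per_vector

# 22 rounds per bit give (15/16)^22 < 1/4; 21 rounds do not reach 1/4.
enough = Fraction(15, 16) ** 22 < Fraction(1, 4)
too_few = Fraction(15, 16) ** 21 < Fraction(1, 4)
```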

Note that this algorithm is actually a zero-error algorithm: it always outputs
the correct answer. Its expected number of queries on a uniformly random input
is at most $m = 22n$ for generating $i_1, \ldots, i_m$ and at most $2^{-n} \cdot 2^n = 1$ for querying
all the $x_i$ if the first step does not give $i_1, \ldots, i_m$ that generate $\{0,1\}^n$. This
completes the proof of the first part of Theorem 2. □
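The classical post-processing in this section only needs to decide whether the measured vectors generate all of $\{0,1\}^n$ over $GF(2)$. One possible way to do this, shown here as a sketch with vectors encoded as Python integers (an encoding chosen for this example, not prescribed by the text), is Gaussian elimination on bitmasks:

```python
def gf2_rank(vectors):
    """Rank over GF(2) of bit vectors given as non-negative integers."""
    basis = {}  # position of leading 1 -> basis vector with that leading 1
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v      # v is independent of the current basis
                break
            v ^= basis[lead]         # eliminate the leading 1 and continue
    return len(basis)

def generates_full_space(vectors, n):
    """True iff the vectors span {0,1}^n over GF(2)."""
    return gf2_rank(vectors) == n
```

If `generates_full_space` returns `True`, only $k = 0^n$ is orthogonal to all measured vectors; otherwise the algorithm falls back to querying every $x_i$.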



4.3 Classical Lower Bound 

Let $D_1$ be the uniform distribution over all inputs $X \in \{0,1\}^N$ and $D_2$ be the
uniform distribution over all $X$ for which there is a unique $k \neq 0$ such that
$x_i = x_{i \oplus k}$ (and hence $f(X) = 1$). We say an algorithm $A$ distinguishes between
$D_1$ and $D_2$ if the average probability that $A$ outputs 0 is > 3/4 under $D_1$ and
the average probability that $A$ outputs 1 is > 3/4 under $D_2$.

Lemma 2. If there is a bounded-error algorithm $A$ that computes $f$ with
$m = T_A^{unif}$ queries on average, then there is an algorithm that distinguishes between
$D_1$ and $D_2$ and uses $O(m)$ queries on all inputs.

Proof. We run $A$ until it stops or makes $4m$ queries. The average probability
(under $D_1$) that it stops is at least 3/4, for otherwise the average number of
queries would be more than $\frac{1}{4}(4m) = m$. Under $D_1$, the probability that $A$
outputs $f(X) = 1$ is at most $1/4 + o(1)$ (1/4 is the maximum probability of
error on an input with $f(X) = 0$ and $o(1)$ is the probability of getting an input
with $f(X) = 1$). Therefore, the probability under $D_1$ that $A$ outputs 0 after at
most $4m$ queries is at least $3/4 - (1/4 + o(1)) = 1/2 - o(1)$.

In contrast, the $D_2$-probability that $A$ outputs 0 is $\leq 1/4$ because $f(X) = 1$
for any input $X$ from $D_2$. We can use this to distinguish $D_1$ from $D_2$. □



Lemma 3. No classical randomized algorithm $A$ that makes $m \in o(2^{n/2})$ queries
can distinguish between $D_1$ and $D_2$.

Proof. For a random input from $D_1$, the probability that all answers to $m$ queries
are different is

$$1 \cdot (1 - 1/2^n) \cdots (1 - (m-1)/2^n) \geq (1 - m/2^n)^m = 1 - o(1).$$

For a random input from $D_2$, the probability that there is an $i$ s.t. $A$ queries
both $x_i$ and $x_{i \oplus k}$ ($k$ is the hidden vector) is $\leq \binom{m}{2}/(2^n - 1) \in o(1)$, since:

1. for every pair of distinct $i, j$, the probability that $i = j \oplus k$ is $1/(2^n - 1)$;

2. since $A$ queries only $m$ of the $x_i$, it queries only $\binom{m}{2}$ distinct pairs $i, j$.
If no pair $x_i$, $x_{i \oplus k}$ is queried, the probability that all answers are different is

$$1 \cdot (1 - 1/2^{n-1}) \cdots (1 - (m-1)/2^{n-1}) = 1 - o(1).$$

It is easy to see that all sequences of $m$ different answers are equally likely.
Therefore, for both distributions $D_1$ and $D_2$, we get a uniformly random sequence
of $m$ different values with probability $1 - o(1)$ and something else with probability
$o(1)$. Thus $A$ cannot “see” the difference between $D_1$ and $D_2$ with sufficient
probability to distinguish between them. □
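The two products in this proof are birthday-type bounds, and their behaviour around $m \approx 2^{n/2}$ is easy to inspect numerically. The following sketch uses floating-point arithmetic and concrete values ($n = 30$) chosen only for illustration; `prob_all_distinct` is a hypothetical helper name.

```python
def prob_all_distinct(m, domain_size):
    """Probability that m uniform draws from a domain are pairwise distinct."""
    p = 1.0
    for j in range(m):
        p *= 1 - j / domain_size
    return p

n = 30
few_queries = prob_all_distinct(2 ** 5, 2 ** n)     # m far below 2^(n/2)
many_queries = prob_all_distinct(2 ** 15, 2 ** n)   # m equal to 2^(n/2)
```

For $m$ far below $2^{n/2}$ the probability is close to 1, matching the $1 - o(1)$ in the proof, while at $m = 2^{n/2}$ collisions already occur with constant probability.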

The second part of Theorem 2 now follows: a classical algorithm that computes
$f$ with an average number of $m$ queries can be used to distinguish between
$D_1$ and $D_2$ with $O(m)$ queries (Lemma 2), but then $O(m) \in \Omega(2^{n/2})$ (Lemma 3).

5 Super-Exponential Gap for Non-uniform $\mu$

The last section gave an exponential gap between $Q^{\mu}$ and $R^{\mu}$ under uniform $\mu$.
Here we show that the gap can be even larger for non-uniform $\mu$. Consider the
average-case complexity of the OR-function. It is easy to see that $D^{unif}(\mathrm{OR})$,
$R^{unif}(\mathrm{OR})$, and $Q^{unif}(\mathrm{OR})$ are all $O(1)$, since the average input will have many
1s under the uniform distribution. Now we give some examples of non-uniform
distributions $\mu$ where $Q^{\mu}(\mathrm{OR})$ is super-exponentially smaller than $R^{\mu}(\mathrm{OR})$:

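Theorem 3. If $\alpha \in (0, 1/2)$ and $\mu(X) = c / \left(\binom{N}{|X|} (|X|+1)^{\alpha} (N+1)^{1-\alpha}\right)$ ($c \approx 1 - \alpha$
is a normalizing constant), then $R^{\mu}(\mathrm{OR}) \in \Theta(N^{\alpha})$ and $Q^{\mu}(\mathrm{OR}) \in \Theta(1)$.

Proof. Any classical algorithm for OR requires $\Theta(N/(|X|+1))$ queries on input
$X$. The upper bound follows from random sampling, the lower bound from a
block-sensitivity argument. Hence (omitting the intermediate $\Theta$s):

$$R^{\mu}(\mathrm{OR}) = \sum_X \mu(X) \frac{N}{|X|+1} = \sum_{t=0}^{N} \frac{cN^{\alpha}}{(t+1)^{\alpha+1}} \in \Theta(N^{\alpha}).$$

Similarly, for a quantum algorithm $\Theta(\sqrt{N/(|X|+1)})$ queries are necessary and
sufficient on input $X$, so

$$Q^{\mu}(\mathrm{OR}) = \sum_X \mu(X) \sqrt{\frac{N}{|X|+1}} = \sum_{t=0}^{N} \frac{cN^{\alpha-1/2}}{(t+1)^{\alpha+1/2}} \in \Theta(1).$$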

In particular, for $\alpha = 1/2 - \varepsilon$ we have the huge gap $O(1)$ quantum versus
$\Omega(N^{1/2-\varepsilon})$ classical. Note that we obtain this super-exponential gap by weighing
the complexity of two algorithms (classical and quantum OR-algorithms) which
are only quadratically apart on each input $X$.
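The two sums in the proof of Theorem 3 can be evaluated numerically to see the claimed growth rates. The sketch below takes the per-input costs to be exactly $N/(|X|+1)$ and $\sqrt{N/(|X|+1)}$, i.e. it ignores the constant factors hidden in the $\Theta$s, and groups inputs by weight $t$ so that class $t$ has total mass proportional to $(t+1)^{-\alpha}$. The function name and the sample values of $N$ and $\alpha$ are assumptions for illustration.

```python
def average_costs(N, alpha):
    """Return (classical, quantum) mu-average costs for OR, up to constants."""
    # Total mu-mass of weight class t is proportional to (t+1)^(-alpha).
    raw = [(t + 1) ** (-alpha) for t in range(N + 1)]
    total = sum(raw)
    classical = sum(w / total * N / (t + 1) for t, w in enumerate(raw))
    quantum = sum(w / total * (N / (t + 1)) ** 0.5 for t, w in enumerate(raw))
    return classical, quantum

alpha = 0.25
r_small, q_small = average_costs(1000, alpha)
r_large, q_large = average_costs(100000, alpha)
```

Increasing $N$ by a factor of 100 should multiply the classical average by roughly $100^{\alpha}$, while the quantum average stays bounded.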

In fact, a small modification of $\mu$ gives the same big gap even if the quantum
algorithm is forced to output the correct answer always. We omit the details.
6 General Bounds for Average-Case Complexity

In this section we prove some general bounds. First we make precise the intu- 
itively obvious fact that if an algorithm A is faster on every input than another 
algorithm B, then it is also much faster on average under any distribution: 

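Theorem 4. If $\phi : \mathbb{R} \to \mathbb{R}$ is a concave function and $T_A(X) \leq \phi(T_B(X))$ for
all $X$, then $T_A^{\mu} \leq \phi(T_B^{\mu})$ for every $\mu$.

Proof. By Jensen’s inequality, if $\phi$ is concave then $E_{\mu}[\phi(T)] \leq \phi(E_{\mu}[T])$, hence

$$T_A^{\mu} \leq \sum_{X \in \{0,1\}^N} \mu(X)\,\phi(T_B(X)) \leq \phi\left(\sum_{X \in \{0,1\}^N} \mu(X)\,T_B(X)\right) = \phi(T_B^{\mu}). \qquad \Box$$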

In words: taking the average cannot make the complexity-gap between two
algorithms smaller. For instance, if $T_A(X) \leq \sqrt{T_B(X)}$ (say, $A$ is Grover’s algorithm
and $B$ is a classical algorithm for OR), then $T_A^{\mu} \leq \sqrt{T_B^{\mu}}$. On the other
hand, taking the average can make the gap much larger, as we saw in Theorem 3:
the quantum algorithm for OR runs only quadratically faster than any
classical algorithm on each input, but the average-case gap between quantum
and classical can be much bigger than quadratic.
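The square-root instance of Theorem 4 can be illustrated on a toy distribution. In the sketch below, `mu` is a list of probabilities and `t_b` the corresponding running times of a hypothetical algorithm $B$; both are made-up numbers used only to show the direction of the inequality.

```python
import math

def averages(mu, t_b):
    """Return (E_mu[sqrt(T_B)], sqrt(E_mu[T_B])) for a finite distribution."""
    mean_of_sqrt = sum(p * math.sqrt(t) for p, t in zip(mu, t_b))
    sqrt_of_mean = math.sqrt(sum(p * t for p, t in zip(mu, t_b)))
    return mean_of_sqrt, sqrt_of_mean

# Two inputs with equal probability and running times 1 and 9.
mean_of_sqrt, sqrt_of_mean = averages([0.5, 0.5], [1, 9])
```

Here the average of the square roots is 2, whereas the square root of the average is $\sqrt{5}$, so the average-case gap is at least as large as the pointwise one.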

We now prove a general lower bound on $R^{\mu}$ and $Q^{\mu}$. Using an argument
from m for the classical case and an argument from |2j for the quantum case, 
we can show: 

Lemma 4. Let $A$ be a bounded-error algorithm for some function $f$. If $A$ is classical
then $T_A(X) \in \Omega(bs_X(f))$, and if $A$ is quantum then $T_A(X) \in \Omega(\sqrt{bs_X(f)})$.

A lower bound in terms of the $\mu$-expected block sensitivity follows:

Theorem 5. For all $f$, $R^{\mu}(f) \in \Omega(E_{\mu}[bs_X(f)])$ and $Q^{\mu}(f) \in \Omega(E_{\mu}[\sqrt{bs_X(f)}])$.

7 Average-Case Complexity of MAJORITY 

Here we examine the average-case complexity of the MAJORITY-function. The
hard inputs for majority occur when $t = |X| \approx N/2$. Any quantum algorithm
needs $\Omega(N)$ queries for such inputs. Since the uniform distribution puts most
probability on the set of $X$ with $|X|$ close to $N/2$, we might expect an $\Omega(N)$
average-case complexity. However we will prove that the complexity is nearly
$\sqrt{N}$. For this we need the following result about approximate quantum counting,
which follows from |Hl Theorem 5] (see also or |T7tI Theorem 1.10]): 

Theorem 6 (Brassard, Høyer, Tapp; Mosca). Let $\alpha \in [0,1]$. There is a
quantum algorithm with worst-case $O(N^{\alpha})$ queries that outputs an estimate $\tilde{t}$ of
the weight $t = |X|$ of its input, such that $|\tilde{t} - t| \leq N^{1-\alpha}$ with probability $\geq 2/3$.

Theorem 7. For every $\varepsilon > 0$, $Q^{unif}(\mathrm{MAJ}) \in O(N^{1/2+\varepsilon})$.






Andris Ambainis and Ronald de Wolf 



Proof. Consider the following algorithm, with input $X$, and $\alpha \in [0,1]$ to be
determined later.

1. Estimate $t = |X|$ by $\tilde{t}$ using $O(N^{\alpha})$ queries.

2. If $\tilde{t} < N/2 - N^{1-\alpha}$ then output 0; if $\tilde{t} > N/2 + N^{1-\alpha}$ then output 1.

3. Otherwise use $N$ queries to classically count $t$ and output its majority.

It is easy to see that this is a bounded-error algorithm for MAJ. We determine
its average complexity. The third step of the algorithm will be invoked iff
$|\tilde{t} - N/2| \leq N^{1-\alpha}$. Denote this event by “$\tilde{t} \approx N/2$”. For $0 \leq k \leq N^{\alpha}/2$, let $D_k$
denote the event that $kN^{1-\alpha} \leq |t - N/2| < (k+1)N^{1-\alpha}$. Under the uniform
distribution the probability that $|X| = t$ is $\binom{N}{t}2^{-N}$. By Stirling’s formula this
is $O(1/\sqrt{N})$, so the probability of the event $D_k$ is $O(N^{1/2-\alpha})$. In the quantum
counting algorithm, $\Pr[kN^{1-\alpha} \leq |\tilde{t} - t| < (k+1)N^{1-\alpha}] \in O(1/(k+1))$ (this
follows from the upcoming journal version of the quantum counting papers). Hence also
$\Pr[\tilde{t} \approx N/2 \mid D_k] \in O(1/(k+1))$. The probability that the second counting stage is
needed is $\Pr[\tilde{t} \approx N/2]$, which we bound by

$$\sum_{k=0}^{N^{\alpha}/2} \Pr[\tilde{t} \approx N/2 \mid D_k] \cdot \Pr[D_k] = \sum_{k=0}^{N^{\alpha}/2} O\!\left(\frac{1}{k+1}\right) \cdot O(N^{1/2-\alpha}) = O(N^{1/2-\alpha} \log N).$$

Thus we can bound the average-case query complexity of our algorithm by

$$O(N^{\alpha}) + \Pr[\tilde{t} \approx N/2] \cdot N = O(N^{\alpha}) + O(N^{3/2-\alpha} \log N).$$

Choosing $\alpha = 3/4$, we obtain an $O(N^{3/4} \log N)$ algorithm.
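The control flow of the three-step algorithm can be sketched classically if the quantum counting subroutine is treated as a black box. In the following Python sketch the parameter `estimate_weight` stands in for that subroutine and is purely hypothetical; the convention that MAJ outputs 1 iff $|X| > N/2$ is likewise an assumption of this example. The second return value reports whether the expensive exact count was needed.

```python
def majority_two_stage(bits, alpha, estimate_weight):
    """Return (answer, used_exact_count) following steps 1-3."""
    N = len(bits)
    slack = N ** (1 - alpha)
    t_est = estimate_weight(bits, alpha)   # step 1: black-box estimate of |X|
    if t_est < N / 2 - slack:              # step 2: clear minority of 1s
        return 0, False
    if t_est > N / 2 + slack:              # step 2: clear majority of 1s
        return 1, False
    t = sum(bits)                          # step 3: exact count with N queries
    return int(t > N / 2), True

# Stand-in estimator for testing only: it simply returns the exact weight.
def exact_estimator(bits, alpha):
    return sum(bits)

clear_case = majority_two_stage([1] * 90 + [0] * 10, 0.5, exact_estimator)
close_case = majority_two_stage([1] * 51 + [0] * 49, 0.5, exact_estimator)
```

With an exact stand-in estimator the fallback is triggered only for inputs whose weight lies within $N^{1-\alpha}$ of $N/2$, which is the event whose probability the analysis bounds.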

However, we can reiterate this scheme: instead of using $N$ queries in step 3
we could count using $O(N^{\alpha_2})$ instead of $N$ queries, output an answer if there is
a clear majority (i.e. $|\tilde{t} - N/2| > N^{1-\alpha_2}$), otherwise count again using $O(N^{\alpha_3})$
queries etc. If after $k$ stages we still have no clear majority, we count using $N$
queries. For any fixed $k$, we can make the error probability of each stage sufficiently
small using only a constant number of repetitions. This gives a bounded-error
algorithm for MAJORITY. (The above algorithm is the case $k = 1$.)

It remains to bound the complexity of the algorithm by choosing appropriate
values for $k$ and for the $\alpha_i$ (put $\alpha_1 = \alpha$). Let $p_i$ denote the probability under
unif that the $i$th counting-stage will be needed, i.e. that all previous counts gave
results close to $N/2$. Then $p_{i+1} \in O(N^{1/2-\alpha_i} \log N)$ (as above). The average
query complexity is now bounded by:

$$O(N^{\alpha_1}) + p_2 \cdot O(N^{\alpha_2}) + \cdots + p_k \cdot O(N^{\alpha_k}) + p_{k+1} \cdot N =$$
$$O(N^{\alpha_1}) + O(N^{1/2-\alpha_1+\alpha_2} \log N) + \cdots + O(N^{1/2-\alpha_{k-1}+\alpha_k} \log N) + O(N^{3/2-\alpha_k} \log N).$$

Clearly the asymptotically minimal complexity is achieved when all exponents
in this expression are equal. This induces $k - 1$ equations $\alpha_1 = 1/2 - \alpha_i + \alpha_{i+1}$,
$1 \leq i < k$, and a $k$th equation $\alpha_1 = 3/2 - \alpha_k$. Adding up these $k$ equations we
obtain $k\alpha_1 = -\alpha_1 + (k-1)/2 + 3/2$, which implies $\alpha_1 = 1/2 + 1/(2k+2)$. Thus we
have average query complexity $O(N^{1/2+1/(2k+2)} \log N)$. Choosing $k$ sufficiently
large, this becomes $O(N^{1/2+\varepsilon})$. □
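The balancing of exponents can be checked mechanically. The sketch below solves the equations with exact rationals, starting from $\alpha_1 = 1/2 + 1/(2k+2)$ and using $\alpha_{i+1} = \alpha_i + \alpha_1 - 1/2$, and then lists the exponent of $N$ in every term of the bound; the helper names are hypothetical.

```python
from fractions import Fraction

def stage_exponents(k):
    """Exponents alpha_1, ..., alpha_k that equalize all terms of the bound."""
    a1 = Fraction(1, 2) + Fraction(1, 2 * k + 2)
    alphas = [a1]
    for _ in range(k - 1):
        # From alpha_1 = 1/2 - alpha_i + alpha_{i+1}.
        alphas.append(alphas[-1] + a1 - Fraction(1, 2))
    return alphas

def term_exponents(alphas):
    """Exponent of N in each term of the average-case bound."""
    terms = [alphas[0]]
    for prev, cur in zip(alphas, alphas[1:]):
        terms.append(Fraction(1, 2) - prev + cur)
    terms.append(Fraction(3, 2) - alphas[-1])
    return terms
```

For $k = 1$ this reproduces the choice $\alpha = 3/4$ made earlier in the proof.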
The nearly matching lower bound is:

Theorem 8. $Q^{unif}(\mathrm{MAJ}) \in \Omega(\sqrt{N})$.

Proof. Let $A$ be a bounded-error quantum algorithm for MAJORITY. It follows
from the worst-case results cited above that $A$ uses $\Omega(N)$ queries on the hardest
inputs, which are the $X$ with $|X| = N/2 \pm 1$. Since the uniform distribution
puts $\Omega(1/\sqrt{N})$ probability on the set of such $X$, the average-case complexity of
$A$ is at least $\Omega(1/\sqrt{N}) \cdot \Omega(N) = \Omega(\sqrt{N})$. □

What about the classical average-case complexity? Alonso, Reingold, and
Schott prove that $D^{unif}(\mathrm{MAJ}) = 2N/3 - \sqrt{8N/9\pi} + O(\log N)$. We can also
prove that $R^{unif}(\mathrm{MAJ}) \in \Omega(N)$ (for reasons of space we omit the details), so
quantum is almost quadratically better than classical for this problem.

8 Average-Case Complexity of PARITY 

Finally we prove some results for the average-case complexity of PARITY. This
is in many ways the hardest Boolean function. Firstly, $bs_X(f) = N$ for all $X$,
hence by Theorem 5:

Corollary 1. For every $\mu$, $R^{\mu}(\mathrm{PARITY}) \in \Omega(N)$ and $Q^{\mu}(\mathrm{PARITY}) \in \Omega(\sqrt{N})$.

We can bounded-error quantum count $|X|$ exactly, using $O(\sqrt{(|X|+1)N})$
queries. Combining this with a $\mu$ that puts $O(1/\sqrt{N})$ probability on the set
of all $X$ with $|X| > 1$, we obtain $Q^{\mu}(\mathrm{PARITY}) \in O(\sqrt{N})$.

We can prove $Q^{\mu}(\mathrm{PARITY}) \leq N/6$ for any $\mu$ by the following algorithm: with
probability 1/3 output 1, with probability 1/3 output 0, and with probability 1/3
run the exact quantum algorithm for PARITY, which has worst-case complexity
$N/2$. This algorithm has success probability 2/3 on every input and has
expected number of queries equal to $N/6$.

More than a linear speed-up on average is not possible if $\mu$ is uniform:

Theorem 9. $Q^{unif}(\mathrm{PARITY}) \in \Omega(N)$.

Proof. Let $A$ be a bounded-error quantum algorithm for PARITY. Let $B$ be
an algorithm that flips each bit of its input $X$ with probability 1/2, records
the number $b$ of actual bitflips, runs $A$ on the changed input $Y$, and outputs
$A(Y) \oplus (b \bmod 2)$. It is easy to see that $B$ is a bounded-error algorithm for PARITY and
that it uses an expected number of $T_A^{unif}$ queries on every input. Using standard
techniques, we can turn this into an algorithm for PARITY with worst-case
$O(T_A^{unif})$ queries. Since the worst-case lower bound for PARITY is $N/2$, the
theorem follows. □
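The reduction from worst case to uniform average case used in this proof is a random self-reduction, and its correctness can be illustrated classically. In the sketch below the argument `algorithm` plays the role of $A$; for testing we plug in an exact parity function, which is an assumption of the example rather than a quantum algorithm.

```python
import random

def parity(bits):
    """Parity of a list of 0/1 values."""
    return sum(bits) % 2

def randomized_wrapper(algorithm, bits, rng):
    """Algorithm B: run `algorithm` on a randomly re-flipped copy of the input."""
    flips = [rng.randrange(2) for _ in bits]   # flip each bit with prob. 1/2
    y = [x ^ f for x, f in zip(bits, flips)]   # Y is uniformly distributed
    b = sum(flips)                             # number of actual bitflips
    return algorithm(y) ^ (b % 2)              # undo the change in parity

rng = random.Random(1)
inputs = [[rng.randrange(2) for _ in range(16)] for _ in range(50)]
results = [randomized_wrapper(parity, x, rng) for x in inputs]
```

Because $Y$ is uniformly distributed whatever $X$ is, the wrapped algorithm behaves on every input as $A$ does on a uniformly random one.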
Abstract. In this paper, lower bound and tradeoff results relating the
computational power of determinism, nondeterminism, and randomness 
for communication protocols and branching programs are presented. The 
main results can be divided into the following three groups. 

(i) One of the few major open problems concerning nondeterminis- 
tic communication complexity is to prove an asymptotically exact 
tradeoff between complexity and the number of available advice bits. 
This problem is solved here for the case of one-way communication. 

(ii) Multipartition protocols are introduced as a new type of communica- 
tion protocols using a restricted form of non-obliviousness. In order 
to be able to study methods for proving lower bounds on multilec- 
tive and/or non-oblivious computation, these protocols are allowed 
to either deterministically or nondeterministically choose between 
different partitions of the input. Here, the first results showing the 
potential increase of the computational power by non-obliviousness 
as well as boundaries on this power are derived. 

(iii) The above results (and others) are applied to obtain several new 
exponential lower bounds for different types of oblivious branching 
programs, which also yields new insights into the power of nonde- 
terminism and randomness for the considered models. The proofs 
rely on a general technique described here which allows one to prove
explicit lower bounds on the size of oblivious branching programs 
in an easy and transparent way. 



1 Introduction and Definitions 

The communication complexity of two-party protocols was introduced by
Abelson and Yao. The initial goal was to develop a method for proving
lower bounds on the complexity of distributed and parallel computations.

* This work has been supported by DFG grants HR 14/3-2 and We 1066/8-2. 
Let $f: \{0,1\}^n \to \{0,1\}$ be a Boolean function defined on a set $X$ of $n$ Boolean
variables, and let $\Pi = (X_1, X_2)$ ($X_1 \cup X_2 = X$, $X_1 \cap X_2 = \emptyset$) be a partition of
$X$. A (communication) protocol $P$ computing $f$ according to $\Pi$ consists of two
computers $C_I$ and $C_{II}$ with unbounded computational power. At the beginning
of the computation, $C_I$ obtains an input $x: X_1 \to \{0,1\}$ and $C_{II}$ obtains an
input $y: X_2 \to \{0,1\}$. Then $C_I$ and $C_{II}$ communicate according to the protocol
by exchanging binary coded messages until one of them knows the result $f(x,y)$.
The cost of the computation of $P$ on an input $(x,y)$ is the sum of the lengths
of exchanged messages. The cost of the protocol $P$, $\mathrm{cc}(P)$, is the maximum of
the cost over all inputs $(x,y)$. The communication complexity of $f$ according to $\Pi$,
$\mathrm{cc}(f, \Pi)$, is the cost of the best protocol computing $f$ according to $\Pi$. There are
several ways to define the communication complexity of $f$, the choice depending
on the application considered. Usually, the communication complexity of $f$,
$\mathrm{cc}(f)$, is defined as the minimum of $\mathrm{cc}(f, \Pi)$ over all balanced partitions of the
set of input variables. Analogously, $\mathrm{ncc}(f)$ stands for the nondeterministic communication
complexity of $f$. Finally, for a communication complexity measure
$x(f)$ used in this paper, $x_k(f)$ denotes the $k$-round version of $x(f)$, and $\mathrm{n}x^r(f)$
denotes the corresponding nondeterministic complexity with the bound $r$ on the
number of advice bits.

In the two decades of its existence, communication complexity has established 
itself as a well defined subarea of complexity theory (see for a thorough 

introduction) . The reason for this is the success in the following two main streams 
of research: 

I. Comparison of the power of different modes of computation. Communication 
protocols are one of the few models of computation where the relative power 
of deterministic, nondeterministic, and randomized computation could be 
characterized, leading to a better understanding of these modes of computa- 
tion. 

II. Proving lower bounds. Communication complexity has considerably con- 
tributed to proving lower bounds on the amount of resources required to 
solve concrete problems in several fundamental sequential and parallel mod- 
els of computation, e. g., circuits, Turing machines, and branching programs. 

In this paper, we contribute to both of these streams of research. In the first 
part of the paper, new results concerning the power of nondeterminism for communication
protocols are presented. The second part deals with the application of
these results (and others) to prove lower bounds for branching programs.

Communication with Restricted Nondeterminism. With respect to 
the first point from above, we may not only be interested in the question whether 
nondeterminism or randomness helps at all to compute a given function, but we 
may also ask the following, more sophisticated questions: 

- How many advice or random bits are needed to achieve the full power of the 
respective model of computation? 

^ We always talk about private nondeterminism, where advice bits have to be explicitly 
communicated if the other computer has to know them. In an unrestricted public 
model, each function would have complexity at most 1 (see [S]). 
- Can all nondeterministic or random guesses be moved to the very beginning of
the computation without increasing the complexity and the required number 
of advice or random bits, resp.? 

For communication protocols and the resource randomness, the following answers
to these questions have been given. Newman has proven that at most
$O(\log n)$ random bits evaluated at the beginning of the computation are sufficient
to get the full power out of randomized protocols with bounded error
or zero-error. Canetti and Goldreich as well as Fleischer, Jung, and Mehlhorn
have proven lower bounds on the number of required random bits in terms of the
communication complexity for various models. These bounds are asymptotically
optimal.

The dependence of the nondeterministic communication complexity of two-way
protocols on the available number of advice bits has been analyzed by
Hromkovič and Schnitger. They have shown that for every fixed number of
advice bits $r(n) = O(\log^c n)$, $c > 1$ an arbitrary constant, there is a function
$f_{r(n)}: \{0,1\}^{2n} \to \{0,1\}$ which has nondeterministic communication complexity
$O(\log^c n)$ if at least $r(n)$ advice bits are available, but $\Omega(n/\log n)$ if only
$o(r(n)/\log n)$ advice bits may be used. Up to now, it is open to prove an asymptotically
exact tradeoff between nondeterministic communication complexity and
the number of advice bits.

We present a partial solution to this problem by proving such a tradeoff for
one-way communication, where the second computer has to output the result
after receiving a single message from the first computer. We prove that the
conjunction of $s$ copies of the well-known “pointer” or “index” function $\mathrm{IND}_n$ has
nondeterministic one-way complexity $\Theta(sn \cdot 2^{-r/s} + r)$, where $r$ is the number
of advice bits (the input size is $\Theta(s(n + \log n))$).

As our second main contribution, we introduce and investigate a new type of 
communication protocols. Usual protocols are oblivious in the sense that they 
work only with a fixed partition of the input variables. Multipartition protocols 
introduced here may work with different partitions depending on the input and 
allow a controlled degree of non-obliviousness. In order to study the dependence 
of the complexity on the available amount of nondeterminism, we allow nonde- 
terministic multipartition protocols to guess a partition from a given collection. 

Definition 1. Let $k$ be a positive integer, and let $f$ be a Boolean function defined
on a set $X$ of input variables. A (deterministic) $k$-partition protocol $P$ for $f$ consists
of $k+1$ two-party communication protocols $(P_0, \Pi_0), (P_1, \Pi_1), \ldots, (P_k, \Pi_k)$,
where $\Pi_i$ is a balanced partition for $i = 0, 1, \ldots, k$. For an arbitrary input
$x: X \to \{0,1\}$, the protocol $P$ works as follows.

(i) The protocol $(P_0, \Pi_0)$ computes a value from $\{1, 2, \ldots, k\}$ for the input $x$
partitioned according to $\Pi_0$.

(ii) If $P_0(x) = i$, then protocol $P_i$ is executed on the input $x$ partitioned according
to $\Pi_i$, and its output is $P_i(x) = f(x)$.

The communication complexity of $P$ is

$$\mathrm{cc}(P_0, \Pi_0) + \max\{\mathrm{cc}(P_i, \Pi_i) \mid i = 1, \ldots, k\}.$$
The $k$-partition communication complexity of $f$, $k\text{-pcc}(f)$, is the minimum over
the communication complexities of all $k$-partition protocols computing $f$. The
multipartition communication complexity of $f$ is $\mathrm{pcc}(f) := \min_k\{k\text{-pcc}(f)\}$.

A nondeterministic $k$-partition protocol $P$ for $f$ is a collection of $k$ deterministic
protocols $(P_1, \Pi_1), \ldots, (P_k, \Pi_k)$, where $\Pi_i$ is a balanced partition of $X$
for $i = 1, \ldots, k$. For an arbitrary input $x$, the protocol $P$ works as follows:

(i) If $f(x) = 0$, then $P_i(x) = 0$ for all $i = 1, \ldots, k$; and

(ii) if $f(x) = 1$, then there exists an $i \in \{1, \ldots, k\}$ such that $P_i(x) = 1$.

The communication complexity of $P$ is $\lceil \log k \rceil + \max\{\mathrm{cc}(P_i, \Pi_i) \mid i = 1, \ldots, k\}$.
The nondeterministic $k$-partition communication complexity of $f$, $k\text{-pncc}(f)$, is
the cost of the best nondeterministic $k$-partition protocol for $f$. The nondeterministic
multipartition communication complexity of $f$ is defined as
$\mathrm{pncc}(f) := \min_k\{k\text{-pncc}(f)\}$.

Note that it is important to add $\lceil \log k \rceil$ to the nondeterministic $k$-partition
communication complexity because the computers have to agree on the partition
they use. If this agreement were for free, then our model would be as powerful
as public advice nondeterministic communication protocols.

In order to apply communication complexity for proving lower bounds, it is
often convenient to consider the uniform version of protocols introduced and
applied in earlier work. Informally, a uniform protocol decides whether a given input
$w \in \Sigma^*$ belongs to a language $L \subseteq \Sigma^*$ for arbitrary partitions $w = xy$, where
$x, y \in \Sigma^*$. For Boolean functions, we use the following definition.

Definition 2. Let $f$ be a Boolean function defined on the variables $x_1, \ldots, x_n$,
and let $\pi$ be a permutation of the set $\{1, \ldots, n\}$, here called a variable ordering. A
(deterministic) uniform communication protocol for $f$ with variable ordering $\pi$,
denoted by $(P, \pi)$, is a collection of deterministic one-way communication protocols
$(P_1, \Pi_1), \ldots, (P_{n-1}, \Pi_{n-1})$, $\Pi_i := (\{x_{\pi(1)}, \ldots, x_{\pi(i)}\}, \{x_{\pi(i+1)}, \ldots, x_{\pi(n)}\})$
for $i = 1, \ldots, n-1$. The uniform communication complexity of $f$ according to
$\pi$ is $\mathrm{ucc}(f, \pi) := \max\{\mathrm{cc}_1(f, \Pi_i) \mid i = 1, \ldots, n-1\}$.

A nondeterministic uniform $k$-partition protocol $P$ with variable orderings
$\pi_1, \ldots, \pi_k$ for a Boolean function $f$ on $n$ variables is a collection of $k$ uniform
protocols $(P_1, \pi_1), \ldots, (P_k, \pi_k)$, where for an arbitrary input $x$:

(i) if $f(x) = 0$, then $P_i(x) = 0$ for all $i = 1, \ldots, k$; and

(ii) if $f(x) = 1$, there exists an $i \in \{1, \ldots, k\}$ such that $P_i(x) = 1$.

The cost of $P$ is $\lceil \log k \rceil + \max\{\mathrm{ucc}(P_i, \pi_i) \mid i = 1, \ldots, k\}$. The nondeterministic
uniform $k$-partition communication complexity of $f$, $k\text{-pnucc}(f)$, is the cost of
the best nondeterministic uniform $k$-partition protocol for $f$.
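The combinatorial part of Definition 2 is easy to make concrete. The following Python sketch lists the $n - 1$ partitions induced by a variable ordering and evaluates the cost formula of a nondeterministic uniform $k$-partition protocol from given per-protocol complexities. Variables are represented by their indices, the logarithm is taken to base 2, and the function names are hypothetical; all three are assumptions of this example.

```python
def induced_partitions(pi):
    """Partitions Pi_1, ..., Pi_{n-1} for a variable ordering pi (a list)."""
    return [(frozenset(pi[:i]), frozenset(pi[i:])) for i in range(1, len(pi))]

def ceil_log2(k):
    """Smallest integer r with 2^r >= k, for k >= 1."""
    return (k - 1).bit_length()

def nondeterministic_uniform_cost(ucc_values):
    """ceil(log k) + max ucc(P_i, pi_i) for k = len(ucc_values) protocols."""
    k = len(ucc_values)
    return ceil_log2(k) + max(ucc_values)

# Ordering x_2, x_3, x_1 on three variables gives two partitions.
parts = induced_partitions([2, 3, 1])
```

The additive term `ceil_log2(k)` accounts for the advice bits needed to agree on which of the $k$ orderings is used.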

We now present our results for multipartition communication complexity. 
Our goal is to compare usual nondeterministic protocols with nondeterministic 
multipartition protocols. 

On the one hand, it turns out that already between the deterministic two- 
partition model and the usual nondeterministic model without restrictions we 
have the maximal possible gap, i. e., constant versus linear complexity. Hence, 
multipartition protocols appear to be quite powerful. 
It is much harder to also obtain a result in the opposite direction. As the second
main result of the paper, we will show that a function which is easy for one-way 
nondeterministic protocols using at most log n advice bits has large complexity 
for nondeterministic uniform multipartition protocols which may use at most 
logarithmically many partitions: 

(i) For every n = w? -I- 1, m. S N, there is a Boolean function /„ on n variables 

such that 2-pcc(/„) = 0(1), but ncc(/„) = 

(ii) For every n = 6 ■ m S N, there is a Boolean function on n 

variables such that ncc[*°®"^ (g„) = O(logn); but k-pmicc{g„) = 

for all k < logn/5. 

These are results for explicitly defined functions. For the complexity measure
$k$-pncc we finally show the following non-constructive result: There are functions
with complexity at most $m + k$ for usual deterministic one-way protocols,
but with complexity larger than m for nondeterministic 2^“^-partition protocols. 
Hence, one needs exponentially many partitions in the nondeterministic multi- 
partition model only to halve the complexity compared to the usual deterministic 
one-way model. 

Branching Program Complexity. Branching programs (BPs) are one 
of the standard nonuniform models of computation and an especially interesting 
“testing field” for lower bound techniques, see, e. g., 1271281 for an introduction. 
A branching program is a graph representing a single Boolean function. The 
complexity of a branching program, called branching program size, is the number 
of its nodes. Nondeterministic and randomized variants of branching programs 
are defined in a straightforward way by specifying a subset of the input variables 
whose values are chosen by coin tosses or nondeterministic guesses, resp. (see, 
e.g., |2S| for details). Such variables are called probabilistic or nondeterministic 
variables of the branching program, and we require that each such variable may 
appear at most once on each path from the source to a sink of the branching 
program. 

An oblivious branching program is a branching program with an associated 
sequence s of input variables (which may contain duplicates). For each path 
from the source to a sink of the BP, the sequence of variables on this path 
has to be a subsequence of s. The length of a path corresponds to the time of 
computation for this path, and oblivious BPs have originally been introduced to 
study time-space tradeoffs. A special time-restricted model are oblivious read-k- 
times branching programs which have the property that on every path from the 
source to one of the sinks each input variable appears at most k times. For k = 1, 
we obtain oblivious read-once BPs, better known as OBDDs (ordered binary 
decision diagrams) . In this case, the variable sequence s is simply a permutation, 
called the variable ordering of the OBDD. Lower bounds for oblivious BPs have 
been proven in |4lhll2llVllHllh| . Some of these papers also contain (or imply) 
lower bounds for nondeterministic oblivious BPs, and also a few results for the 
randomized case are known |2i:-!l24fA^j . 

A variant of the standard model of nondeterminism for oblivious BPs is given by
partitioned BDDs, which were originally invented for application purposes.
A $k$-partitioned BDD ($k$-PBDD) is a branching program with a tree of nodes
labeled by nondeterministic variables at the top by which one of $k$ OBDDs is
chosen, where each of these OBDDs may use a different variable ordering. A
partitioned BDD is a $k$-PBDD for some $k$. This type of nondeterministic BPs
is interesting for complexity theory because it allows a fine control of the available
amount of nondeterminism as well as a bounded non-oblivious access to the
input variables. Theoretical results for partitioned BDDs have been proven by
Bollig and Wegener. Among other results, they have shown that the classes of
functions with polynomial size $k$-PBDDs form a proper hierarchy with respect
to $k$. It is an open problem to compare this type of nondeterminism with usual
nondeterministic OBDDs.

We will apply the results on communication complexity proven here (together
with known results) to attack some open questions concerning the power of nondeterminism
for the above types of BPs. As a tool, we use so-called overlapping
communication complexity, introduced in earlier work, which allows one to prove lower bounds
for oblivious read-$k$-times BPs in a clean and transparent way superior to the
previously used techniques.

We first deal with the dependence of the size of nondeterministic OBDDs on
the available amount of nondeterminism. For randomized OBDDs, it is known
that $O(\log n)$ random bits are sufficient to exploit the full power of
randomness for n-input functions […]. We show here that imposing a logarithmic
bound on the number of available nondeterministic variables may increase the
OBDD size from $(n/\log n)^{O(\log n)}$ to […]. (Recently, it has been proven
by a different technique that even an increase from polynomial to exponential
size is possible [26].)

Furthermore, we partially solve the open problem to compare nondeterministic
OBDDs and k-PBDDs by showing that these two kinds of nondeterminism
are incomparable if k is logarithmically bounded.

Finally, we compare the power of the deterministic and the Las Vegas variant
of oblivious read-k-times BPs. It has already been proven that the size measures
for deterministic OBDDs and Las Vegas OBDDs are polynomially related […];
on the other hand, for general (non-oblivious) read-once BPs an exponential gap
could be established […]. Here we shed new light on this surprisingly different
behavior by showing that the size of oblivious (deterministic) read-k-times BPs,
where $2 < k < \log n/5$ and n is the input size, may be superpolynomial in the
size of oblivious Las Vegas read-2-times BPs.

This extended abstract is organized as follows. In Section 2 we present
and discuss our results. The common technique behind the proofs of the lower
bounds for uniform multipartition communication complexity and oblivious BPs
is sketched in Section 3. For the full proofs of these lower bounds, as well as
the involved constructions required for the upper bounds, we have to refer to
the journal version of this paper.



2 Results 

2.1 Communication Complexity 

The first main result of the paper is the asymptotically exact tradeoff between
nondeterministic one-way communication complexity and the number of allowed
advice bits. Such a result is easy to prove if a logarithmic number of advice
bits is sufficient to obtain small nondeterministic complexity (e.g., it holds
that $\mathrm{ncc}^r(\mathrm{NE}_n) = \Theta(n/2^r + r)$ for the
string-nonequality function $\mathrm{NE}_n$, see [14]).
Here we are concerned with the much harder case where superlogarithmically
many advice bits are required.

We consider the well-known function
$\mathrm{IND}_n\colon \{0,1\}^n \times \{1,\dots,n\} \to \{0,1\}$
defined by $\mathrm{IND}_n(x, y) = x_y$. The conjunction of s copies of this
function, $\mathrm{IND}_{s,n}\colon \{0,1\}^{sn} \times \{1,\dots,n\}^s \to \{0,1\}$,
is defined by

$$\mathrm{IND}_{s,n}((x^1,\dots,x^s),(y^1,\dots,y^s)) := \mathrm{IND}_n(x^1,y^1) \wedge \dots \wedge \mathrm{IND}_n(x^s,y^s),$$

where $x^1,\dots,x^s \in \{0,1\}^n$ and $y^1,\dots,y^s \in \{1,\dots,n\}$. Our
analysis yields the following nearly exact bounds for this function:

Theorem 1. For every $n, r, s \in \mathbb{N}$,

(i) $\mathrm{ncc}_1^r(\mathrm{IND}_{s,n}) \le \alpha \cdot s \cdot n \cdot 2^{-r/s} + s + r$, where $\alpha := 2/(e \cdot \ln 2) < 1.062$; and

(ii) $\mathrm{ncc}_1^r(\mathrm{IND}_{s,n}) \ge \tfrac{1}{2} \cdot s \cdot n \cdot 2^{-r/s} + r$.
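
To make the function in Theorem 1 concrete, the index function and its s-fold conjunction can be written down directly. This is only an illustrative sketch: the names `ind`, `ind_conj`, `ALPHA`, and `upper_bound` are assumptions, and `upper_bound` evaluates the expression of part (i) under the assumption that it reads α·s·n·2^(-r/s) + s + r.

```python
import math


def ind(x, y):
    """IND_n(x, y) = x_y, where y is a 1-based pointer into the bit list x."""
    return x[y - 1]


def ind_conj(xs, ys):
    """IND_{s,n}: conjunction of s index functions on disjoint inputs."""
    return int(all(ind(x, y) == 1 for x, y in zip(xs, ys)))


# Constant from part (i): alpha = 2 / (e * ln 2).
ALPHA = 2 / (math.e * math.log(2))


def upper_bound(n, r, s):
    """Value of alpha * s * n * 2^(-r/s) + s + r (assumed reading of part (i))."""
    return ALPHA * s * n * 2 ** (-r / s) + s + r
```

Evaluating `upper_bound` for growing r shows how the dominant term shrinks geometrically in r/s while the additive term r grows linearly.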

Our next two results deal with the relation between multipartition
communication complexity and usual deterministic and nondeterministic complexity.

Consider the function $\mathrm{MRC}_n\colon \{0,1\}^{n^2+1} \to \{0,1\}$
("monochromatic rows or columns") defined on a Boolean $n \times n$ matrix
$X = (x_{i,j})$ and an additional variable $z$ by

$$\mathrm{MRC}_n(X, z) = \Bigl(z \wedge \bigwedge_{1 \le i \le n} (x_{i,1} = \dots = x_{i,n})\Bigr) \vee \Bigl(\bar{z} \wedge \bigwedge_{1 \le i \le n} (x_{1,i} = \dots = x_{n,i})\Bigr).$$

Theorem 2.

(i) $2\text{-pcc}(\mathrm{MRC}_n) \le 3$; but

(ii) $\mathrm{ncc}(\mathrm{MRC}_n) \ge \lfloor n/\sqrt{2} \rfloor$.
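
As a small illustration of the "monochromatic rows or columns" function, the sketch below evaluates it on a matrix given as a list of rows. It assumes that z = 1 selects the row test and z = 0 selects the column test; the function name `mrc` is hypothetical.

```python
def mrc(X, z):
    """MRC_n(X, z): with z = 1 test that every row is constant,
    with z = 0 test that every column is constant (assumed convention)."""
    rows_monochromatic = all(len(set(row)) == 1 for row in X)
    cols_monochromatic = all(len(set(col)) == 1 for col in zip(*X))
    return int(rows_monochromatic if z == 1 else cols_monochromatic)
```

Intuitively, each of the two tests is easy under the partition that keeps whole rows (or whole columns) on one side, which is why the choice between two partitions helps here.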

Theorem 2 shows that already the possibility of a deterministic choice of one
out of two partitions may be much more powerful than unrestricted
nondeterminism. Our next results lie in the opposite direction, i.e., we are
going to prove limits on the power of multipartition protocols. We first
describe a general construction technique which will allow us to derive
variants of standard functions which are "hard" for multipartition protocols
(and also for oblivious BPs).

Definition 3. Let a function $f_n\colon \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$,
$n \in \mathbb{N}$, be given, and let $m \ge 2$ be arbitrarily chosen. We
define the function $m\text{-Masked-}f_n\colon \{0,1\}^{3mn} \to \{0,1\}$ on
Boolean vectors $s = (s_1,\dots,s_{mn})$, $t = (t_1,\dots,t_{mn})$, and
$z = (z_1,\dots,z_{mn})$ as follows. If either s or t does not contain exactly
n ones, then we set $m\text{-Masked-}f_n(s,t,z) := 0$. Otherwise, let
$i_1 < \dots < i_n$ and $j_1 < \dots < j_n$ be the positions of ones in s and
t, resp., and define
$m\text{-Masked-}f_n(s,t,z) := f_n(z_{i_1},\dots,z_{i_n},z_{j_1},\dots,z_{j_n})$.

We apply this construction to the string-nonequality function
$\mathrm{NE}_n\colon \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$, defined by
$\mathrm{NE}_n(x,y) = 1$ iff $x \ne y$. For arbitrary $m \ge 2$, we
obtain a function $m\text{-Masked-NE}_n$ on $3mn$ variables.
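
The masking construction can be sketched in a few lines: the vectors s and t select which positions of z are fed to the underlying function as its first and second argument. The names `masked` and `ne` below are hypothetical, and bit vectors are represented as Python lists of 0/1 values.

```python
def ne(x, y):
    """String-nonequality: 1 iff the two bit strings differ."""
    return int(x != y)


def masked(f, n, m, s, t, z):
    """m-Masked-f_n on vectors s, t, z of length m*n each.

    Returns 0 unless s and t each contain exactly n ones; otherwise the
    ones of s (resp. t) pick the bits of z used as the first (resp. second)
    argument of f.
    """
    assert len(s) == len(t) == len(z) == m * n
    if sum(s) != n or sum(t) != n:
        return 0
    xs = [z[i] for i, bit in enumerate(s) if bit == 1]
    ys = [z[j] for j, bit in enumerate(t) if bit == 1]
    return f(xs, ys)
```

Because the selected positions depend on the input, no fixed partition or variable ordering can anticipate which bits of z end up as the two arguments of f.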
Theorem 3. Let $m := 2 \cdot [\ldots]$ and $N := 6 \cdot [\ldots]$ (the input
size of $m\text{-Masked-NE}_n$). Then

(i) $\mathrm{ncc}_1^{[\ldots]}(m\text{-Masked-NE}_n) = O(\log N)$; but

(ii) $k\text{-pnucc}(m\text{-Masked-NE}_n) = [\ldots]$ for all $k < \log N/5$.

For the multipartition model with uniform protocols, logarithmically many
partitions may thus be exponentially weaker than logarithmically many usual
advice bits. What happens if we replace uniform protocols by usual protocols?
Although our present technique does not yield a result similar to Theorem 3 in
this case, we can show by counting arguments that hard functions must also
exist for the general multipartition model. By choosing k = m = n/2 in the
following theorem, we obtain that […] partitions may not be sufficient to halve
the communication complexity compared to deterministic one-way protocols.

Theorem 4. For all positive integers $k, m, n \in \mathbb{N}$ with $k + m < n$,
there exists a Boolean function $F_n^{k,m}\colon \{0,1\}^{2n} \to \{0,1\}$ such that

(i) $\mathrm{cc}_1(F_n^{k,m}) \le m + k$; and

(ii) $2^{[\ldots]}\text{-pncc}(F_n^{k,m}) \ge m$.

2.2 Branching Programs 

We start by an analysis of the dependence of the size of nondeterministic OBDDs
(nondeterministic oblivious read-once BPs) on the resource nondeterminism.

Let $\mathrm{UIND}_{s,n}\colon \{0,1\}^{sn} \times \{0,1\}^{sn} \to \{0,1\}$ be
defined as the variant of the function $\mathrm{IND}_{s,n}$ from above where we
use a unary encoding for the "pointers" instead of a binary one. We consider
the function $2\text{-Masked-UIND}_{s,n}$ on $N = 6sn$ variables obtained by
applying Definition 3 with $m := 2$. Let $\text{N-OBDD}^r(f)$ be the minimum
size of a nondeterministic OBDD for f that uses at most r nondeterministic
variables.

Theorem 5.

(i) $\text{N-OBDD}(2\text{-Masked-UIND}_{s,n}) = O(n^{s+3} \cdot 2^s \cdot s^{[\ldots]})$; but

(ii) $\text{N-OBDD}^r(2\text{-Masked-UIND}_{s,n}) = [\ldots]$ for all $r \in \mathbb{N}$.

Choosing $s := \lceil \log n \rceil$, we obtain the following gap.

Corollary 1. For $N = 6\lceil \log n \rceil n$, the input size of
$2\text{-Masked-UIND}_{\lceil \log n \rceil, n}$,

(i) $\text{N-OBDD}(2\text{-Masked-UIND}_{\lceil \log n \rceil, n}) = (N/\log N)^{O(\log N)}$; but

(ii) $\text{N-OBDD}^r(2\text{-Masked-UIND}_{\lceil \log n \rceil, n}) = 2^{[\ldots]}$ if $r = O(\log N)$.

The next problem which we consider is the comparison of the usual form of
nondeterminism in nondeterministic OBDDs with the nondeterministic choice
of variable orderings in partitioned BDDs. Up to now, it has been open to find a
concrete example for which k-partitioned OBDDs are superior to the usual type
of nondeterminism for any k. Here we show that even 2-partitioned BDDs may
be exponentially smaller than usual nondeterministic OBDDs. Let
$k\text{-PBDD}(f)$ denote the minimal size of a k-partitioned BDD for a
function f.

For the result in the opposite direction, we again consider the "masked
variant" of the string-nonequality function from Theorem 3.

Theorem 7. Let $m := 2 \cdot [\ldots]$ and $N := 6 \cdot [\ldots]$ (the input
size of $m\text{-Masked-NE}_n$). Then

(i) $\text{N-OBDD}(m\text{-Masked-NE}_n) = O(N^{[\ldots]})$; but

(ii) $k\text{-PBDD}(m\text{-Masked-NE}_n) = 2^{[\ldots]}$ for every $k < \log N/5$.

Finally, we show a superpolynomial gap between Las Vegas and determinism
for oblivious read-k-times BPs. Let $\mathrm{NE}_{n,n}$ be the function
obtained by taking the conjunction of n disjoint string-nonequality functions
$\mathrm{NE}_n$ (analogous to the definition of $\mathrm{IND}_{s,n}$). We again
apply Definition 3 with parameter $m := 2 \cdot 3^{2\lceil \log n \rceil}$,
obtaining functions $m\text{-Masked-NE}_{n,n}$ on
$N = 6 \cdot 3^{2\lceil \log n \rceil} \cdot n^2$ input variables.
Let $k\mathrm{OBP}(f)$ ($\text{LV-}k\mathrm{OBP}(f)$) denote the minimal size
of oblivious deterministic (Las Vegas, resp.) read-k-times BPs for a function f.

Theorem 8. Let $m := 2 \cdot 3^{2\lceil \log n \rceil}$ and
$N := 6 \cdot 3^{2\lceil \log n \rceil} \cdot n^2$ (the input size of
$m\text{-Masked-NE}_{n,n}$). Let $\alpha := 1/(2 \log 3 + 2)$.

(i) $\text{LV-2OBP}(m\text{-Masked-NE}_{n,n}) = [\ldots]$; but

(ii) $k\mathrm{OBP}(m\text{-Masked-NE}_{n,n}) = [\ldots]$ for every $k < \log N/5$.

Hence, there is a sequence of functions $f_N\colon \{0,1\}^N \to \{0,1\}$ which
is (explicitly) defined for infinitely many N such that, for every
$k < \log N/5$,

$$k\mathrm{OBP}(f_N) = 2^{\Omega\left(\log^{[\ldots]}(\text{LV-2OBP}(f_N))/(k \cdot \log^{[\ldots]} N)\right)}.$$

3 On the Proofs of the Lower Bounds 

In this section, we comment on the general technique used for proving the lower 
bounds on uniform multipartition communication complexity and on the size of 
oblivious BPs. 

It is well-known how results on communication complexity can be applied to
prove lower bounds for models of computation which are read-once and oblivious.
Several attempts have been made to extend this approach also to models with
multiple read access to the input variables and with a limited degree of
non-obliviousness; in the case of branching programs, e.g., in […]. Techniques
which explicitly use the language of communication complexity theory are found
in […], where the notion of overlapping communication complexity has been
introduced, and in the monograph of Kushilevitz and Nisan [21].

The idea common to these extended approaches is that partitions of the
input variables are replaced by covers. Let $\Gamma = (X_1, X_2)$ be a cover of
the input variables from X, i.e., $X_1 \cup X_2 = X$ and $X_1, X_2$ need not be
disjoint. Considering the communication problem where the computers $C_I$ and
$C_{II}$ obtain inputs $x\colon X_1 \to \{0,1\}$ and $y\colon X_2 \to \{0,1\}$
such that $x|_{X_1 \cap X_2} = y|_{X_1 \cap X_2}$, one can define an
overlapping protocol according to $\Gamma$ analogous to the classical scenario
where $X_1 \cap X_2 = \emptyset$. The complexity of such a protocol is defined
as the maximum over all valid input assignments, and the overlapping
communication complexity of f with respect to $\Gamma$,
$\mathrm{occ}(f, \Gamma)$, is the minimum complexity of an overlapping
protocol for f according to $\Gamma$.

Unfortunately, there are only a few functions for which lower bounds on the
overlapping complexity could be proven so far, and the respective proofs often
hide the fact that these bounds rely on lower bounds for usual communication
complexity. The approach followed for the results in this paper (based on
[…]) allows us to apply known results for usual communication complexity to
derive new lower bounds on overlapping communication complexity in a simple
and straightforward way.

We can only briefly sketch the ideas behind our technique here:

(1) As an extension of the notion of rectangular reductions from communication
complexity theory, we define generalized rectangular reductions in such a way
that, if a generalized rectangular reduction from f to g exists, then lower
bounds on the usual communication complexity of f yield lower bounds for
the overlapping communication complexity of g.

(2) For each considered model of computation (e.g., uniform multipartition
protocols, oblivious branching programs), we show how overlapping
communication complexity yields lower bounds on the respective complexity in
the model. It turns out that it is sufficient to consider overlapping
complexity with respect to a special type of covers, called alternating covers
(which are also implicitly considered in the paper of Alon and Maass [4]).

(3) It remains to prove lower bounds on overlapping communication complexity
with respect to alternating covers. At this point, our scheme for constructing
"masked versions" of functions (Definition 3) comes into play. Together
with the reductions from (1) and a "Ramsey-like" combinatorial lemma, this
allows us to exploit the whole collection of results for usual communication
complexity to obtain the required lower bounds on overlapping complexity.

Here we only consider oblivious BPs as an example for the application of
these ideas. Given an arbitrary sequence s of variables from X (possibly with
duplicates), partition this sequence into contiguous segments and number them,
say from 1 to $\ell$. Then the alternating cover $\Gamma = (X_1, X_2)$ of X
with respect to s is defined by putting all segments with odd number into $X_1$
and all segments with even number into $X_2$.
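
The construction of an alternating cover is easy to state in code. The sketch below is illustrative only: the helper `split_into_segments` uses equal-length segments, which is merely one hypothetical way of choosing contiguous segments, and both function names are assumptions.

```python
def split_into_segments(s, segment_length):
    """Cut the variable sequence s into contiguous segments of a fixed length
    (the last segment may be shorter). The equal length is an assumption."""
    return [s[i:i + segment_length] for i in range(0, len(s), segment_length)]


def alternating_cover(segments):
    """Put the variables of odd-numbered segments into X1 and those of
    even-numbered segments into X2 (segments are numbered from 1)."""
    x1, x2 = set(), set()
    for number, segment in enumerate(segments, start=1):
        (x1 if number % 2 == 1 else x2).update(segment)
    return x1, x2
```

Since the sequence may contain duplicates, a variable can occur in both an odd and an even segment, so the two sets form a cover rather than a partition.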

To carry out part (2) of the above plan, we construct an $(\ell - 1)$-round
overlapping protocol from any given oblivious BP G with variable sequence s,
using ideas from the papers […]. Communication rounds correspond to sets of
"cut nodes" in G, and thus it can be shown that
$\lceil \log |G| \rceil \ge \mathrm{occ}_{\ell-1}(g, \Gamma)/(\ell - 1)$,
where g is the function represented by G. It remains to show that g has "high"
overlapping complexity, which is done as described in (3) above.

Putting all this together, we arrive at the following theorem which
summarizes the proof technique for the special case of oblivious read-k-times BPs.
Theorem 9. Let f be an arbitrary Boolean function defined on 2n variables,
and let $\Pi$ be an arbitrary balanced partition of the variable set of f. Let
$k, p \in \mathbb{N}$ with $k < p$ and define $m := 2 \cdot 3^{2p}$. Let G be
an oblivious read-k-times BP for $m\text{-Masked-}f$ with respect to an
arbitrary variable sequence. Then

$$\lceil \log |G| \rceil \ge \mathrm{cc}_{2k-1}(f, \Pi)/(2k - 1).$$

Analogous assertions hold for nondeterministic and randomized variants of
oblivious read-k-times BPs.

Using this theorem, it is easy to obtain several new lower bounds for oblivious
read-k-times BPs using the known results from communication complexity. The
lower bounds in Theorem 5 and Theorem 8 are proven in this way. For Theorem 7,
we need the additional observation that $k\text{-PBDD}(f) \ge k\mathrm{OBP}(f)$
for all Boolean functions f due to Bollig and Wegener [6]. The lower bound for
nondeterministic uniform multipartition protocols in Theorem 3 is proven by
adapting the proof for partitioned BDDs to communication protocols and hence
also relies on the same technique.

Acknowledgement. Thanks to Ingo Wegener for the idea to look at the
"Las Vegas versus determinism" problem for oblivious read-k-times BPs and to
Detlef Sieling for helpful comments on an early version of the proof of
Theorem […].

References 

1. H. Abelson. Lower bounds on information transfer in distributed computations. In
Proc. of 19th IEEE Symp. on Foundations of Computer Science (FOCS), 151-158,
1978.

2. F. Ablayev. Randomization and nondeterminism are incomparable for polynomial
ordered binary decision diagrams. In Proc. of 24th Int. Coll. on Automata,
Languages, and Programming (ICALP), LNCS 1256, 195-202. Springer, 1997.

3. F. Ablayev and M. Karpinski. On the power of randomized branching programs.
In Proc. of 23rd Int. Coll. on Automata, Languages, and Programming (ICALP),
LNCS 1099, 348-356. Springer, 1996.

4. N. Alon and W. Maass. Meanders and their applications in lower bounds
arguments. Journal of Computer and System Sciences, 37:118-129, 1988.

5. L. Babai, N. Nisan, and M. Szegedy. Multiparty protocols, pseudorandom gen- 
erators for logspace and time-space trade-offs. Journal of Computer and System 
Sciences, 45:204-232, 1992. 

6. B. Bollig and I. Wegener. Complexity theoretical results on partitioned (nonde- 
terministic) binary decision diagrams. Theory of Computing Systems, 32:487-503, 
1999. (Earlier version in Proc. of 22nd Int. Symp. on Mathematical Foundations 
of Computer Science (MFCS), LNCS 1295, 159-168. Springer, 1997.) 

7. R. Canetti and O. Goldreich. Bounds on tradeoffs between randomness and
communication complexity. Computational Complexity, 3:141-167, 1993.

8. P. Duris and Z. Galil. On the power of multiple reads in a chip. Information and 
Computation, 104:277-287, 1993. 

9. P. Duris, Z. Galil, and G. Schnitger. Lower bounds on communication complexity. 
In Proc. of 16th Ann. ACM Symp. on Theory of Computing (STOC), 81-91, 1984. 






10. P. Duris, J. Hromkovic, J. D. P. Rolim, and G. Schnitger. Las Vegas versus
determinism for one-way communication complexity, finite automata, and
polynomial-time computations. In Proc. of 14th Ann. Symp. on Theoretical
Aspects of Computer Science (STACS), LNCS 1200, 117-128. Springer, 1997. To
appear in Information and Computation.

11. R. Fleischer, H. Jung, and K. Mehlhorn. A communication-randomness tradeoff
for two-processor systems. Information and Computation, 116:155-161, 1995.

12. J. Gergov. Time-space tradeoffs for integer multiplication on various types of input
oblivious sequential machines. Information Processing Letters, 51:265-269, 1994.

13. J. Hromkovic. Communication Complexity and Parallel Computing. EATCS Texts
in Theoretical Computer Science. Springer, Berlin, 1997.

14. J. Hromkovic and G. Schnitger. Nondeterministic communication with a limited
number of advice bits. In Proc. of 28th Ann. ACM Symp. on Theory of Computing
(STOC), 551-560, 1996.

15. J. Hromkovic. Communication complexity and lower bounds on multilective
computations. Theoretical Informatics and Applications (RAIRO), 33:193-212, 1999.

16. J. Jain, J. Bitner, J. A. Abraham, and D. S. Fussell. Functional partitioning for 
verification and related problems. In T. Knight and J. Savage, editors. Advanced 
Research in VLSI and Parallel Systems: Proceedings of the 1992 Brown/MIT Con- 
ference, 210-226, 1992. 

17. S. P. Jukna. Lower bounds on communication complexity. Mathematical Logic and 
Its Applications, 5:22-30, 1987. 

18. M. Krause. Lower bounds for depth-restricted branching programs. Information 
and Computation, 91(1):1-14, Mar. 1991. 

19. M. Krause and S. Waack. On oblivious branching programs of linear length. In- 
formation and Computation, 94:232-249, 1991. 

20. K. Kriegel and S. Waack. Lower bounds on the complexity of real-time branching 
programs. Theoretical Informatics and Applications (RAIRO), 22:447-459, 1988. 

21. E. Kushilevitz and N. Nisan. Communication Complexity. Cambridge University 
Press, Cambridge, 1997. 

22. K. Mehlhorn and E. Schmidt. Las-Vegas is better than determinism in VLSI and
distributed computing. In Proc. of 14th Ann. ACM Symp. on Theory of Computing
(STOC), 330-337, 1982.

23. I. Newman. Private vs. common random bits in communication complexity.
Information Processing Letters, 39:67-71, 1991.

24. M. Sauerhoff. Complexity Theoretical Results for Randomized Branching Programs.
PhD thesis, Univ. of Dortmund. Shaker, 1999.

25. M. Sauerhoff. On the size of randomized OBDDs and read-once branching
programs for k-stable functions. In Proc. of 16th Ann. Symp. on Theoretical
Aspects of Computer Science (STACS), LNCS 1563, 488-499. Springer, 1999.

26. M. Sauerhoff. Computing with restricted nondeterminism: The dependence of the
OBDD size on the number of nondeterministic variables. To appear in Proc. of
FST & TCS.

27. I. Wegener. The Complexity of Boolean Functions. Wiley-Teubner, 1987.

28. I. Wegener. Branching Programs and Binary Decision Diagrams: Theory and
Applications. Monographs on Discrete and Applied Mathematics. SIAM, 1999. To
appear.

29. A. C. Yao. Some complexity questions related to distributive computing. In Proc.
of 11th Ann. ACM Symp. on Theory of Computing (STOC), 209-213, 1979.




The Boolean Hierarchy of NP-Partitions 

(Extended Abstract) 



Sven Kosub and Klaus W. Wagner 

Theoretische Informatik, Julius-Maximilians-Universität Würzburg
Am Hubland, D-97074 Würzburg, Germany
{kosub,wagner}@informatik.uni-wuerzburg.de



Abstract. We introduce the boolean hierarchy of k-partitions over NP
for $k \ge 3$ as a generalization of the boolean hierarchy of sets (i.e.,
2-partitions) over NP. Whereas the structure of the latter hierarchy is
rather simple, the structure of the boolean hierarchy of k-partitions over
NP for $k \ge 3$ turns out to be much more complicated. We establish the
Embedding Conjecture which enables us to get a complete idea of this
structure. This conjecture is supported by several partial results.



1 Introduction 

To divide the real world into two parts like big and small, black and white, 
or good and bad usually oversimplifies things. In most cases a partition into 
many parts is more appropriate. For example, take marks in school, scores for 
papers submitted to a conference, salary groups, or classes of risk. In
mathematics, k-valued logic is just a language for dealing with k-valent
objects, and in the computer science field of artificial intelligence, this
language has become a powerful tool for reasoning about incomplete knowledge.
In computational complexity, for instance, proper partitions, although not
mentioned explicitly, emerge in connection with locally definable acceptance
types (cf. […]).

Nevertheless, complexity theoreticians mainly investigate the complexity of 
sets, i.e., partitions into two parts, or, the other extreme, the complexity of 
functions, i.e., partitions into usually infinitely many parts. But what about 
partitions into 3, 4, 5, . . . parts? 

This paper studies, as a first step in this direction, complexity classes of
k-partitions which correspond to the classes of the boolean hierarchy of sets
(i.e., 2-partitions). This investigation is justified by the fact that in the
cases $k \ge 3$ there are interesting new phenomena which cannot be treated
appropriately when encoding k-partitions by sets. On the other hand, with the
boolean hierarchy of sets we have a well-studied reference structure.

The most general way to define the boolean hierarchy of sets over NP is as
follows (see […]): For a boolean function $f\colon \{0,1\}^m \to \{0,1\}$ and
sets $B_1,\dots,B_m$ define the set $f(B_1,\dots,B_m)$ by
$c_{f(B_1,\dots,B_m)}(x) =_{\mathrm{def}} f(c_{B_1}(x),\dots,c_{B_m}(x))$. The
class NP(f) consists of all sets $f(B_1,\dots,B_m)$ when varying the sets $B_i$
over NP. The boolean hierarchy (of sets) over NP consists of the classes NP(f).
It was proved in […] that every class NP(f) coincides with one of the classes
NP(i) or coNP(i), where NP(i) is the class of all sets which are the symmetric
difference of i NP-sets.

This approach is generalized in Sect. 3 to the case of k-partitions. The
characteristic function of a k-partition $A = (A_1,\dots,A_k)$ is defined by
$c_A(x) = i \iff x \in A_i$. For a function
$f\colon \{1,2\}^m \to \{1,2,\dots,k\}$ and sets $B_1,\dots,B_m$, taken
as the 2-partitions $(B_i, \overline{B_i})$ for every $i \in \{1,\dots,m\}$,
define a k-partition $A = f(B_1,\dots,B_m)$ by
$c_A(x) =_{\mathrm{def}} f(c_{B_1}(x),\dots,c_{B_m}(x))$. The boolean hierarchy
of k-partitions over NP consists of the classes
$\mathrm{NP}(f) =_{\mathrm{def}} \{\, f(B_1,\dots,B_m) \mid B_1,\dots,B_m \in \mathrm{NP} \,\}$.
The boolean hierarchy of sets over NP now appears as the special case k = 2.

Whereas the boolean hierarchy of sets over NP has a very simple structure
(note that $\mathrm{NP}(i) \cup \mathrm{coNP}(i) \subseteq \mathrm{NP}(i+1) \cap \mathrm{coNP}(i+1)$
for all $i \ge 1$), the situation is much more complicated for the boolean
hierarchy of k-partitions in the case $k \ge 3$. The main question is: Can we
get an overview of the structure of this hierarchy? This question is not
answered completely so far, but in the remaining sections we give partial
answers, and we establish a conjecture.

A function $f\colon \{1,2\}^m \to \{1,2,\dots,k\}$ which defines the class
NP(f) of k-partitions corresponds to the finite boolean lattice
$(\{1,2\}^m, \le)$ with the labeling function f. Generalizing this idea we
define for every finite lattice G with labeling function
$f\colon G \to \{1,2,\dots,k\}$ (for short: the k-lattice (G, f)) a class
NP(G, f) of k-partitions. This does not result in more classes: In Sect. 4 we
state that for every k-lattice (G, f) there exists a finite function $f'$ such
that $\mathrm{NP}(G, f) = \mathrm{NP}(f')$. However, the use of arbitrary
lattices instead of only boolean lattices simplifies many considerations.

To get an idea of the structure of the boolean hierarchy of k-partitions over
NP it is very important to have a criterion for
$\mathrm{NP}(G, f) \subseteq \mathrm{NP}(G', f')$ for k-lattices (G, f) and
(G', f'). In Sect. 5 we define a relation $\le$ as follows:
$(G, f) \le (G', f')$ if and only if there is a monotonic mapping
$\varphi\colon G \to G'$ such that $f(x) = f'(\varphi(x))$ for all $x \in G$.
We prove the Embedding Lemma which says that $(G, f) \le (G', f')$ implies
$\mathrm{NP}(G, f) \subseteq \mathrm{NP}(G', f')$, and we establish the
Embedding Conjecture which says that the converse is also true unless the
polynomial-time hierarchy collapses.
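
For small labeled posets the relation just described can be decided by brute force: try every label-preserving map and test monotonicity. The sketch below is illustrative only; the function name `embeds` and the representation of an order as a Python predicate are assumptions, and the search is exponential in the size of the first poset.

```python
from itertools import product


def embeds(G, leq_G, f, H, leq_H, g):
    """Return True if some monotone map phi: G -> H satisfies f[x] == g[phi(x)]
    for every x in G.

    G, H   : lists of elements
    leq_G/H: order predicates, leq(a, b) is True iff a <= b
    f, g   : dicts assigning a label to every element
    """
    # Only images carrying the same label are admissible for each element.
    candidates = [[y for y in H if g[y] == f[x]] for x in G]
    for images in product(*candidates):
        phi = dict(zip(G, images))
        if all(leq_H(phi[a], phi[b]) for a in G for b in G if leq_G(a, b)):
            return True
    return False


# Two labeled chains: 1 < 2 with labels (1, 2) and 1 < 2 < 3 with labels (1, 2, 1).
chain2, labels2 = [0, 1], {0: 1, 1: 2}
chain3, labels3 = [0, 1, 2], {0: 1, 1: 2, 2: 1}


def chain_leq(a, b):
    return a <= b
```

In this toy instance the shorter chain maps monotonically into the longer one, while the longer chain, with its extra alternation of labels, does not map into the shorter one.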

In Sect. 6 we collect evidence for our Embedding Conjecture. For k = 2 we
confirm this conjecture to be true. Moreover, we give a theorem which enables
us to verify the Embedding Conjecture for $k \ge 3$ for a large class of
k-lattices including all k-chains. The proof of this theorem uses Kadin's
easy-hard-technique (cf. […]).

Assume the Embedding Conjecture is true. Then the set inclusion structure
of the boolean hierarchy of k-partitions is isomorphic to the partial order of
equivalence classes of k-lattices with respect to $\le$. In Sect. 7 we present
the partial order of all 132 equivalence classes which contain boolean
3-lattices of the form $(\{1,2\}^3, f)$. Furthermore, the partial order of
equivalence classes of 3-lattices does not have bounded width. This gives an
impression of the complexity of the (conjectured) structure of the boolean
hierarchy of 3-partitions over NP.
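2 Preliminaries



For classes $\mathcal{K}$ and $\mathcal{K}'$ of subsets of a set M, define
$\mathrm{co}\mathcal{K} =_{\mathrm{def}} \{\, \overline{L} \mid L \in \mathcal{K} \,\}$,
$\mathcal{K} \wedge \mathcal{K}' =_{\mathrm{def}} \{\, A \cap B \mid A \in \mathcal{K}, B \in \mathcal{K}' \,\}$,
and $\mathcal{K} \oplus \mathcal{K}' =_{\mathrm{def}} \{\, A \triangle B \mid A \in \mathcal{K}, B \in \mathcal{K}' \,\}$,
where $A \triangle B$ denotes the symmetric difference of A and B. The classes
$\mathcal{K}(i)$ and $\mathrm{co}\mathcal{K}(i)$ defined by
$\mathcal{K}(1) =_{\mathrm{def}} \mathcal{K}$ and
$\mathcal{K}(i+1) =_{\mathrm{def}} \mathcal{K}(i) \oplus \mathcal{K}$ for
$i \ge 1$ build the boolean hierarchy over $\mathcal{K}$ that has many
equivalent definitions (see […]). Since
$\mathcal{K}(i) \cup \mathrm{co}\mathcal{K}(i) \subseteq \mathcal{K}(i+1) \cap \mathrm{co}\mathcal{K}(i+1)$
for all $i \ge 1$ and $\mathcal{K}$ with $M \in \mathcal{K}$, the boolean
hierarchy has a very clear structure. The class $\mathrm{BC}(\mathcal{K})$ is
the boolean closure of $\mathcal{K}$, that is, the smallest class which
contains $\mathcal{K}$ and which is closed under intersection, union, and
complementation.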

Further we need some notions from lattice theory and order theory (see e.g.,
[…]). A finite poset $(G, \le)$ is a lattice if for all $x, y \in G$ there
exist (a) exactly one maximal element $z \in G$ such that $z \le x$ and
$z \le y$ (which will be denoted by $x \wedge y$), and (b) exactly one minimal
element $z \in G$ such that $z \ge x$ and $z \ge y$ (which will be denoted by
$x \vee y$). For a lattice G we denote by $1_G$ the unique element greater than
or equal to all $x \in G$ and by $0_G$ the unique element less than or equal to
all $x \in G$. An element $x \ne 1_G$ is said to be meet-irreducible iff
$x = a \wedge b$ implies $x = a$ or $x = b$ for all $a, b \in G$.

For symbols $a_1, a_2, \dots, a_m$ from an alphabet $\Sigma$, we identify the
m-tuple $(a_1, a_2, \dots, a_m)$ with the word $a_1 a_2 \dots a_m \in \Sigma^m$.
If there is an order $\le$ on $\Sigma$, we assume $\Sigma^m$ to be partially
ordered by the vector-ordering, that is, $a_1 a_2 \dots a_m \le b_1 b_2 \dots b_m$
if and only if $a_i \le b_i$ for all $i \in \{1, 2, \dots, m\}$.

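Finally, let us make some notational conventions about partitions. For any
set M, a k-tuple $A = (A_1, \dots, A_k)$ is said to be a k-partition of M if
and only if $A_1 \cup A_2 \cup \dots \cup A_k = M$ and
$A_i \cap A_j = \emptyset$ if $i \ne j$. The set $A_i$ is said to be the
i-th component of A. Let $c_A\colon M \to \{1, 2, \dots, k\}$ be the
characteristic function of a k-partition $A = (A_1, \dots, A_k)$ of M, i.e.,
$c_A(x) = i$ if and only if $x \in A_i$ for every $x \in M$ and
$1 \le i \le k$. For classes
$\mathcal{K}_1, \dots, \mathcal{K}_k \subseteq \mathcal{P}(M)$, we define

$$(\mathcal{K}_1, \dots, \mathcal{K}_k) =_{\mathrm{def}} \{\, A \mid A \text{ is a } k\text{-partition of } M \text{ and } A_i \in \mathcal{K}_i \text{ for } 1 \le i \le k \,\}$$

and for $1 \le i \le k$,

$$(\mathcal{K}_1, \dots, \mathcal{K}_{i-1}, \cdot, \mathcal{K}_{i+1}, \dots, \mathcal{K}_k) =_{\mathrm{def}} (\mathcal{K}_1, \dots, \mathcal{K}_{i-1}, \mathcal{P}(M), \mathcal{K}_{i+1}, \dots, \mathcal{K}_k).$$

For a class $\mathcal{K}$ of k-partitions, let
$\mathcal{K}_i =_{\mathrm{def}} \{\, A_i \mid A \in \mathcal{K} \,\}$ be the
i-th projection of $\mathcal{K}$. Obviously,
$\mathcal{K} \subseteq (\mathcal{K}_1, \dots, \mathcal{K}_k)$. In what follows
we identify a set A with the 2-partition $(A, \overline{A})$, and we identify a
class $\mathcal{K}$ of sets with the class
$(\mathcal{K}, \mathrm{co}\mathcal{K}) = (\mathcal{K}, \cdot) = (\cdot, \mathrm{co}\mathcal{K})$
of 2-partitions.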



3 Partition Classes Defined by Finite Functions 

Let $\mathcal{K}$ be a class of subsets of M such that
$\emptyset, M \in \mathcal{K}$ and $\mathcal{K}$ is closed under intersection
and union. In the literature, one way to define the classes of the boolean
hierarchy of sets over $\mathcal{K}$ is as follows (see […]). Let
$f\colon \{1,2\}^m \to \{1,2\}$ be a boolean function. For
$B_1, \dots, B_m \in \mathcal{K}$ the set $f(B_1, \dots, B_m)$ is defined by
$c_{f(B_1,\dots,B_m)}(x) = f(c_{B_1}(x), \dots, c_{B_m}(x))$. Then the classes
$\mathcal{K}(f) =_{\mathrm{def}} \{\, f(B_1, \dots, B_m) \mid B_1, \dots, B_m \in \mathcal{K} \,\}$
form the boolean hierarchy over $\mathcal{K}$. Using finite functions
$f\colon \{1,2\}^m \to \{1, 2, \dots, k\}$ we generalize this definition
(remember in which sense sets are 2-partitions) to obtain the classes of the
boolean hierarchy of k-partitions over $\mathcal{K}$ as follows.

Definition 1. Let $k \ge 2$.

1. For $f\colon \{1,2\}^m \to \{1, 2, \dots, k\}$ and for sets
$B_1, \dots, B_m \in \mathcal{K}$, the k-partition $f(B_1, \dots, B_m)$ is
defined by $c_{f(B_1,\dots,B_m)}(x) = f(c_{B_1}(x), \dots, c_{B_m}(x))$.

2. For $f\colon \{1,2\}^m \to \{1, 2, \dots, k\}$, the class of k-partitions
over $\mathcal{K}$ defined by f is given by the class
$\mathcal{K}(f) =_{\mathrm{def}} \{\, f(B_1, \dots, B_m) \mid B_1, \dots, B_m \in \mathcal{K} \,\}$.

3. The family
$\mathrm{BH}_k(\mathcal{K}) =_{\mathrm{def}} \{\, \mathcal{K}(f) \mid f\colon \{1,2\}^m \to \{1, 2, \dots, k\} \text{ and } m \ge 1 \,\}$
is the boolean hierarchy of k-partitions over $\mathcal{K}$.

4. $\mathrm{BC}_k(\mathcal{K}) =_{\mathrm{def}} \bigcup \mathrm{BH}_k(\mathcal{K})$.
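
The first item of this definition can be tried out on finite sets, which here merely stand in for the sets of the class under consideration. The sketch below is illustrative only; the names `char` and `partition_from_function` are assumptions, and the labeling function is given as a dictionary on tuples over {1, 2}.

```python
def char(B, x):
    """Characteristic function of the 2-partition (B, complement of B):
    1 if x lies in B, 2 otherwise."""
    return 1 if x in B else 2


def partition_from_function(f, sets, universe, k):
    """Build the k-partition f(B_1, ..., B_m) of `universe`.

    f    : dict mapping tuples in {1,2}^m to labels in {1, ..., k}
    sets : the list [B_1, ..., B_m]
    """
    parts = [set() for _ in range(k)]
    for x in universe:
        label = f[tuple(char(B, x) for B in sets)]
        parts[label - 1].add(x)
    return parts


# Labeling with f(11) = 1, f(12) = f(21) = 2, f(22) = 3.
f_123 = {(1, 1): 1, (1, 2): 2, (2, 1): 2, (2, 2): 3}
B1, B2 = {0, 1, 2}, {2, 3}
parts = partition_from_function(f_123, [B1, B2], range(6), 3)
```

For this labeling the three components are the intersection of the two sets, their symmetric difference, and the complement of their union.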

Obviously, if $i \in \{1, \dots, k\}$ is not a value of
$f\colon \{1,2\}^m \to \{1, 2, \dots, k\}$ then
$\mathcal{K}(f)_i = \{\emptyset\}$, i.e., $\mathcal{K}(f)$ does not really have
an i-th component. Therefore we assume in what follows that f is surjective.

The following proposition shows that every partition in $\mathcal{K}(f)$
consists of sets from the boolean hierarchy over $\mathcal{K}$. This also
justifies the use of the term boolean in the above definition.



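Proposition 1. Let $f\colon \{1,2\}^m \to \{1, 2, \dots, k\}$, $k \ge 2$.

1. $(\mathcal{K}, \dots, \mathcal{K}) \subseteq \mathcal{K}(f) \subseteq (\mathrm{BC}(\mathcal{K}), \dots, \mathrm{BC}(\mathcal{K}))$.

2. If $\mathcal{K}$ is closed under complementation then $\mathcal{K}(f) = (\mathcal{K}, \dots, \mathcal{K})$.

3. $\mathrm{BC}_k(\mathcal{K}) = (\mathrm{BC}(\mathcal{K}), \dots, \mathrm{BC}(\mathcal{K}))$.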

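For k = 2 the classes $\mathcal{K}(f)$ of the boolean hierarchy
$\mathrm{BH}_2(\mathcal{K})$ of sets (2-partitions) over $\mathcal{K}$ have
been completely characterized. For $f\colon \{1,2\}^m \to \{1,2\}$ let
$\mu(f)$ be the maximum number of alternations of f-labels which can occur in
a $\le$-chain in $(\{1,2\}^m, \le)$.

Theorem 1. […] For $f\colon \{1,2\}^m \to \{1,2\}$,

$$\mathcal{K}(f) = \begin{cases} \mathcal{K}(\mu(f)), & \text{if } f(2^m) = 2, \\ \mathrm{co}\mathcal{K}(\mu(f)), & \text{if } f(2^m) = 1. \end{cases}$$

Consequently,
$\mathrm{BH}_2(\mathcal{K}) = \{\, \mathcal{K}(n) \mid n \ge 1 \,\} \cup \{\, \mathrm{co}\mathcal{K}(n) \mid n \ge 1 \,\}$,
and given a function $f\colon \{1,2\}^m \to \{1,2\}$ it is easy to determine
the class $\mathcal{K}(n)$ or $\mathrm{co}\mathcal{K}(n)$ which coincides with
$\mathcal{K}(f)$. As mentioned above, the classes of
$\mathrm{BH}_2(\mathcal{K})$ form a simple structure with respect to set
inclusion. There do not exist three classes in $\mathrm{BH}_2(\mathcal{K})$
which are incomparable in this sense.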
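
The alternation number of a labeling of the boolean lattice can be computed by a simple dynamic program over the lattice, processing elements from bottom to top. The sketch below is illustrative only: the names `mu` and `classify` are assumptions, and `classify` rests on the assumed reading of the theorem that the value of f at the top element (2, ..., 2) decides between the class and its complement class.

```python
from itertools import product


def mu(f, m):
    """Maximum number of label alternations along a chain in ({1,2}^m, <=).

    f is a dict mapping every tuple in {1,2}^m to a label. It suffices to
    look at lower covers, since any chain can be refined to a maximal one
    without losing alternations.
    """
    best = {}
    # Sorting by coordinate sum guarantees that lower covers come first.
    for a in sorted(product((1, 2), repeat=m), key=sum):
        best[a] = 0
        for i in range(m):
            if a[i] == 2:
                b = a[:i] + (1,) + a[i + 1:]  # lower cover of a
                best[a] = max(best[a], best[b] + (f[b] != f[a]))
    return max(best.values())


def classify(f, m):
    """Return ("K", n) or ("coK", n) for a labeling f with values in {1, 2},
    assuming the case distinction on f(2, ..., 2) described above."""
    top = (2,) * m
    return ("K" if f[top] == 2 else "coK", mu(f, m))


# Symmetric difference, intersection and complement as labelings
# (label 1 means "in the set", label 2 means "not in the set").
f_xor = {(1, 1): 2, (1, 2): 1, (2, 1): 1, (2, 2): 2}
f_and = {(1, 1): 1, (1, 2): 2, (2, 1): 2, (2, 2): 2}
f_not = {(1,): 2, (2,): 1}
```

The three sample labelings land on the second level, the first level, and the complement of the first level, respectively, which matches what one expects for symmetric difference, intersection, and complementation.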

It is the goal of this paper to get insights into the structure of the boolean
hierarchy $\mathrm{BH}_k(\mathrm{NP})$ of $k$-partitions over NP for $k \ge 3$. What we can say at this
point is that already for $k = 3$ the structure of $\mathrm{BH}_k(\mathrm{NP})$ with respect to set
inclusion is not as simple as for $k = 2$ (unless NP = coNP). This is shown by
the following example.

Example 1. For $a,b,c$ such that $\{a,b,c\} = \{1,2,3\}$ define the function
$f_{abc} : \{1,2\}^2 \to \{1,2,3\}$ by $f_{abc}(11) = a$, $f_{abc}(12) = f_{abc}(21) = b$, and $f_{abc}(22) = c$.
Obviously, $\mathrm{NP}(f_{abc})_a = \mathrm{NP}$, $\mathrm{NP}(f_{abc})_b = \mathrm{NP}(2)$, and $\mathrm{NP}(f_{abc})_c = \mathrm{coNP}$. Now
let $abc \ne a'b'c'$. If $\mathrm{NP}(f_{abc}) = \mathrm{NP}(f_{a'b'c'})$ then NP = NP(2) or NP = coNP, or
NP(2) = coNP. In each of these cases we obtain NP = coNP. Consequently, if
NP $\ne$ coNP then the six classes $\mathrm{NP}(f_{abc})$ are pairwise incomparable with respect
to set inclusion.
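
To make the construction in Example 1 concrete, the following minimal Python sketch computes the 3-partition generated by $f_{abc}$ from two finite sets. It is only an illustration on a finite universe, not part of the original paper. The names `k_partition` and `f_abc` are hypothetical, and the convention that the characteristic value of $x$ with respect to a set is 1 if $x$ belongs to it and 2 otherwise is an assumption inferred from the example.

```python
def k_partition(f, sets, universe, k):
    """Return the k-partition f(B_1, ..., B_m) of `universe` as a list of k sets.

    Assumed convention: the characteristic value of x for B_j is 1 if x is
    in B_j and 2 otherwise; f maps the resulting tuple to a label in 1..k.
    """
    parts = [set() for _ in range(k)]
    for x in universe:
        key = tuple(1 if x in B else 2 for B in sets)
        parts[f[key] - 1].add(x)
    return parts


def f_abc(a, b, c):
    """The function f_abc of Example 1 as a lookup table on {1,2}^2."""
    return {(1, 1): a, (1, 2): b, (2, 1): b, (2, 2): c}


universe = set(range(8))
B1 = {0, 1, 2, 3}
B2 = {2, 3, 4, 5}
parts = k_partition(f_abc(1, 2, 3), [B1, B2], universe, 3)
```

Here component 1 is the intersection of the two sets, component 2 their symmetric difference, and component 3 the complement of their union, mirroring the three classes NP, NP(2) and coNP named in the example.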

4 Partition Classes Defined by Lattices 

It turns out that, for $f : \{1,2\}^m \to \{1,2,\dots,k\}$, a $k$-partition
$f(B_1,\dots,B_m)$ has a very natural equivalent lattice-theoretical definition.
Consider the boolean lattice $\{1,2\}^m$ with the partial vector-ordering $\le$, and
consider the function $S : \{1,2\}^m \to \mathcal{K}$ defined by
$S(a_1,\dots,a_m) =_{def} \bigcap_{a_i = 1} B_i$, where we define an intersection over an
empty index set to be $M$. For an example see Fig. 1. Note that
$S(2,\dots,2) = M$ and $S(a \wedge b) = S(a) \cap S(b)$ for all $a,b \in \{1,2\}^m$. Defining
$T_S(a) =_{def} S(a) \setminus \bigcup_{b < a} S(b)$ we obtain the $i$-th component of $f(B_1,\dots,B_m)$ as
$f(B_1,\dots,B_m)_i = \bigcup_{f(a)=i} T_S(a)$, i.e., $f(B_1,\dots,B_m)$ can also be given by the
function $S : \{1,2\}^m \to \mathcal{K}$.

On the other side, if we have any function $S : \{1,2\}^m \to \mathcal{K}$ such that
$S(2,\dots,2) = M$ and $S(a \wedge b) = S(a) \cap S(b)$ for all $a,b \in \{1,2\}^m$ we can define
$B_j =_{def} S(2^{j-1} 1\, 2^{m-j})$ for $j \in \{1,2,\dots,m\}$, and we obtain
$f(B_1,\dots,B_m)_i = \bigcup_{f(a)=i} T_S(a)$ for $i \in \{1,2,\dots,k\}$. In this manner the class
$\mathcal{K}(f)$ of $k$-partitions is completely characterized by the labeled boolean
lattice $((\{1,2\}^m, \le), f)$.

In this section we will see that classes of $k$-partitions can also be defined by
weaker structures than boolean algebras. Again we always suppose $\mathcal{K}$ to be a
class such that $\emptyset, M \in \mathcal{K}$ and which is closed under intersection and union.

Definition 2. Let $G$ be a lattice.

1. A mapping $S : G \to \mathcal{K}$ is said to be a $\mathcal{K}$-homomorphism on $G$ if and only if
$S(1_G) = M$ and $S(a \wedge b) = S(a) \cap S(b)$ for all $a,b \in G$.

2. For a $\mathcal{K}$-homomorphism $S$ on $G$, let $T_S(a) =_{def} S(a) \setminus \bigcup_{b<a} S(b)$ for $a \in G$.

Lemma 1. Let $G$ be a lattice, and let $S$ be a $\mathcal{K}$-homomorphism on $G$.

1. $T_S(a) \in \mathcal{K} \wedge \mathrm{co}\mathcal{K}$ for every $a \in G$.

2. $S(a) = \bigcup_{b \le a} T_S(b)$ for every $a \in G$.

3. The set of all $T_S(a)$ for $a \in G$ yields a partition of $M$.

4. $S$ is completely determined by its values for the meet-irreducible elements.
That is, if $S$ and $S'$ are two $\mathcal{K}$-homomorphisms on $G$ such that $S(a) = S'(a)$
for all meet-irreducible $a \in G$ then $S(a) = S'(a)$ for all $a \in G$.

Any pair $(G,f)$ of an arbitrary finite poset $G$ and a function
$f : G \to \{1,2,\dots,k\}$ is called a $k$-poset. A $k$-poset which is a lattice
(boolean lattice) is called a $k$-lattice (boolean $k$-lattice, resp.).

Lemma 1 provides the soundness of the following definition.

Fig. 1. Partition defined by a boolean 3-lattice.

Definition 3. Let $(G,f)$ be a $k$-lattice, $k \ge 2$.

1. For a $\mathcal{K}$-homomorphism $S$ on $G$, the $k$-partition defined by $(G,f)$ and $S$ is
given by $(G,f,S) =_{def} \bigl( \bigcup_{f(a)=1} T_S(a), \dots, \bigcup_{f(a)=k} T_S(a) \bigr)$.

2. $\mathcal{K}(G,f) =_{def} \{ (G,f,S) \mid S \text{ is a } \mathcal{K}\text{-homomorphism on } G \}$ is the class of
$k$-partitions defined by $(G,f)$.



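Example 2. Consider the 3-lattice $(G,f)$ in Fig. 2. The meet-irreducible elements
of $G$ are $a$, $b$, and $c$. By point 4 of Lemma 1 every $\mathcal{K}$-homomorphism $S : G \to \mathcal{K}$
is determined by fixing $S(a) = A$, $S(b) = B$, and $S(c) = C$. By the definition of
$\mathcal{K}$-homomorphisms we get $S(1) = M$, $S(d) = S(a \wedge b) = S(a) \cap S(b) = A \cap B$,
and $S(0) = S(d \wedge c) = S(d) \cap S(c) = A \cap B \cap C$. Furthermore, $C = S(c) =
S(c \wedge b) = S(c) \cap S(b) = C \cap B$, i.e., $C \subseteq B$. We obtain

\begin{align*}
T_S(1) &= M \setminus (A \cup B) = \overline{A} \cap \overline{B}, \\
T_S(a) &= A \setminus (A \cap B) = A \cap \overline{B}, \\
T_S(b) &= B \setminus ((A \cap B) \cup C) = \overline{A} \cap B \cap \overline{C}, \\
T_S(c) &= C \setminus (A \cap B \cap C) = \overline{A} \cap C, \\
T_S(d) &= (A \cap B) \setminus (A \cap B \cap C) = A \cap B \cap \overline{C}, \\
T_S(0) &= A \cap B \cap C = A \cap C.
\end{align*}

Hence

\begin{align*}
(G,f,S) &= (T_S(a) \cup T_S(0),\; T_S(1) \cup T_S(c),\; T_S(b) \cup T_S(d)) \\
        &= (A \cap (\overline{B} \cup C),\; \overline{A} \cap (\overline{B} \cup C),\; B \cap \overline{C}),
\end{align*}

and

\begin{align*}
\mathcal{K}(G,f) &= \{ (A \cap (\overline{B} \cup C),\; \overline{A} \cap (\overline{B} \cup C),\; B \cap \overline{C}) \mid A,B,C \in \mathcal{K} \text{ and } C \subseteq B \} \\
                 &\subseteq (\mathcal{K}(3), \mathrm{co}\mathcal{K}(3), \mathcal{K}(2)).
\end{align*}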
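
The computation in Example 2 can be replayed on finite sets. The Python sketch below is an illustration only. The cover relation and the labelling of the lattice are reconstructed from the example's text (Fig. 2 itself is not reproduced here), so they should be read as assumptions, and all identifiers are hypothetical.

```python
# Cover pairs (x, y) with x < y for the 3-lattice of Example 2, reconstructed
# from the relations d = a ^ b, 0 = d ^ c and c = c ^ b (an assumption).
COVERS = {("0", "c"), ("0", "d"), ("c", "b"), ("d", "a"),
          ("d", "b"), ("a", "1"), ("b", "1")}
ELEMS = ["0", "c", "d", "a", "b", "1"]
# Labels read off from the grouping of the T_S sets in the example.
LABEL = {"a": 1, "0": 1, "1": 2, "c": 2, "b": 3, "d": 3}


def less(x, y):
    """Strict lattice order: transitive closure of the cover relation."""
    return (x, y) in COVERS or any(
        (x, z) in COVERS and less(z, y) for z in ELEMS)


def t_sets(S):
    """T_S(a) = S(a) minus the union of S(b) over all b < a."""
    return {a: S[a] - set().union(*[S[b] for b in ELEMS if less(b, a)])
            for a in ELEMS}


def partition(S):
    """The 3-partition (G, f, S): component i collects T_S(a) with f(a) = i."""
    T = t_sets(S)
    return [set().union(*[T[a] for a in ELEMS if LABEL[a] == i])
            for i in (1, 2, 3)]


# Finite stand-ins for M, A, B, C with C a subset of B.
M = set(range(16))
A = {x for x in M if x & 1}
B = {x for x in M if x & 2}
C = {x for x in B if x & 4}
S = {"1": M, "a": A, "b": B, "c": C, "d": A & B, "0": A & B & C}
parts = partition(S)
```

On these sets the three components agree with the closed forms derived in the example, and they cover $M$ without overlap, as point 3 of Lemma 1 requires.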

The discussion at the beginning of the section yields the following proposition.

Proposition 2. $\mathcal{K}(f) = \mathcal{K}((\{1,2\}^m, \le), f)$ for every $f : \{1,2\}^m \to \{1,2,\dots,k\}$.

Fig. 2. Partition defined by a 3-lattice.

So, if $(G,f)$ is a boolean $k$-lattice then $\mathcal{K}(G,f) = \mathcal{K}(f)$. But if $(G,f)$ is
an arbitrary $k$-lattice, is $\mathcal{K}(G,f)$ also of the form $\mathcal{K}(f')$ for a suitable function
$f'$? The following theorem says that this is generally true. This turns out to be
very important for the further study of the structure of the boolean hierarchy
of $k$-partitions because instead of large boolean $k$-lattices one can work with
usually much smaller equivalent $k$-lattices.

Theorem 2. For every $k$-lattice $(G,f)$ there is an $f' : \{1,2\}^m \to \{1,2,\dots,k\}$
with $\mathcal{K}(G,f) = \mathcal{K}(f')$, where $m$ is the number of meet-irreducible elements of $G$.

The proof of this theorem is an application of the Embedding Lemma below.
In fact, it is enough to construct for any $k$-lattice a boolean $k$-lattice which is
equivalent in the sense explained in the next section.



5 Comparing Partition Classes: The Embedding Conjecture

To study the structure of the boolean hierarchy of $k$-partitions over NP it would
be important to have a criterion to decide whether $\mathrm{NP}(G,f) \subseteq \mathrm{NP}(G',f')$ for
any two $k$-lattices $(G,f)$ and $(G',f')$. To this end we establish a relation $\le$
between $k$-lattices. For $k$-lattices $(G,f)$ and $(G',f')$ we write $(G,f) \le (G',f')$ if
and only if there is a monotonic mapping $\varphi : G \to G'$ such that $f(x) = f'(\varphi(x))$
for every $x \in G$. We write $(G,f) \equiv (G',f')$ and we say that $(G,f)$ and $(G',f')$
are equivalent if $(G,f) \le (G',f')$ and $(G',f') \le (G,f)$.

The following lemma gives a sufficient condition for $\mathrm{NP}(G,f) \subseteq \mathrm{NP}(G',f')$.

Lemma 2. (Embedding Lemma.) Let $\mathcal{K}$ be a class with $M \in \mathcal{K}$ and which
is closed under finite union. Let $(G,f)$ and $(G',f')$ be $k$-lattices. If $(G,f) \le
(G',f')$, then $\mathcal{K}(G,f) \subseteq \mathcal{K}(G',f')$.



Example 3. The 3-lattice $(G,f)$ shown in Fig. 1 and the 3-lattice $(G',f')$ shown
in Fig. 3 are equivalent. This can be seen as follows: Define the functions
$\varphi : G \to G'$ and $\psi : G' \to G$ by $\varphi(111) = \varphi(121) = \varphi(211) = a$,
$\varphi(112) = \varphi(221) = b$, $\varphi(122) = \varphi(212) = \varphi(222) = c$, $\psi(a) = 111$,
$\psi(b) = 112$, and $\psi(c) = 222$.
It is easy to see that $\varphi$ and $\psi$ are monotonic, $f(x) = f'(\varphi(x))$ for all $x \in G$,
and $f'(x) = f(\psi(x))$ for all $x \in G'$. By the Embedding Lemma we obtain
$\mathcal{K}(G,f) = \mathcal{K}(G',f')$ for all $\mathcal{K}$. Obviously,
$\mathcal{K}(G',f') = \{ (\overline{B}, A, B \setminus A) \mid A,B \in \mathcal{K} \text{ and } A \subseteq B \} = (\mathrm{co}\mathcal{K}, \mathcal{K}, \cdot) = (\mathrm{co}\mathcal{K}, \mathcal{K}, \mathcal{K}(2))$.

Fig. 3. A 3-chain equivalent to the boolean 3-lattice in Fig. 1.
[Chain diagram; recoverable node labels: b with label 3, a with label 2.]
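
The relation $\le$ between small $k$-posets can be checked by exhaustive search. The Python sketch below does this for the two 3-lattices of Example 3. It is a brute-force illustration, not an algorithm from the paper. The labelling of the boolean lattice is inferred from the maps given in the example, the label 1 for the top chain element $c$ is inferred from the resulting partition, and `find_embedding` is a hypothetical name.

```python
from itertools import product


def find_embedding(elems_g, leq_g, f, elems_h, leq_h, g):
    """Search for a monotone phi: G -> H with f(x) == g(phi(x)); None if absent."""
    pairs = [(x, y) for x in elems_g for y in elems_g if leq_g(x, y)]
    for images in product(elems_h, repeat=len(elems_g)):
        phi = dict(zip(elems_g, images))
        if all(f[x] == g[phi[x]] for x in elems_g) and \
           all(leq_h(phi[x], phi[y]) for x, y in pairs):
            return phi
    return None


# Boolean lattice {1,2}^3 with the componentwise order (1 < 2).
CUBE = list(product((1, 2), repeat=3))


def cube_leq(x, y):
    return all(a <= b for a, b in zip(x, y))


# Labels inferred from Example 3 (assumption): phi-preimages of a, b, c.
F = {(1, 1, 1): 2, (1, 2, 1): 2, (2, 1, 1): 2,
     (1, 1, 2): 3, (2, 2, 1): 3,
     (1, 2, 2): 1, (2, 1, 2): 1, (2, 2, 2): 1}

# The 3-chain a < b < c with labels 2, 3, 1.
CHAIN = ["a", "b", "c"]
F_CHAIN = {"a": 2, "b": 3, "c": 1}


def chain_leq(x, y):
    return CHAIN.index(x) <= CHAIN.index(y)


phi = find_embedding(CUBE, cube_leq, F, CHAIN, chain_leq, F_CHAIN)
psi = find_embedding(CHAIN, chain_leq, F_CHAIN, CUBE, cube_leq, F)
```

Both searches succeed, which is the equivalence claimed in the example. The search is exponential in the size of the first poset, so it is only meant for lattices of the size shown in the figures.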

Let us come back to the Embedding Lemma which shows that $(G,f) \le
(G',f')$ implies $\mathcal{K}(G,f) \subseteq \mathcal{K}(G',f')$. Because of Proposition 1.2, we cannot hope
to reverse this implication without an additional assumption on $\mathcal{K}$. Even an
infinite boolean hierarchy of sets over $\mathcal{K}$ is not sufficient to reverse the implication.
To see this consider the class $\mathrm{IS} =_{def} \{ \{1,2,\dots,n\} \mid n \in \mathbb{N} \text{ or } n = \infty \}$.
Obviously, IS is closed under intersection and union. Moreover, it is an easy exercise
to confirm that $\mathrm{BH}_2(\mathrm{IS})$ is strict. For $\mathrm{BH}_3(\mathrm{IS})$ this is not true.

Proposition 3. Let $(G,f)$ and $(G',f')$ be the 3-lattices shown in Fig. 4. Then
$\mathrm{IS}(G,f) = \mathrm{IS}(G',f')$ but $(G,f) \not\le (G',f')$.

Up to this proposition, all results so far hold for arbitrary classes with some
simple closure properties. What follows makes use of the very nature of
the class NP. Since the collapse of the boolean hierarchy over NP implies the
collapse of the polynomial-time hierarchy (cf. (3) the following conjecture seems 
to be reasonable. 

Embedding Conjecture. Assume the polynomial-time hierarchy does not collapse.
Let $(G,f)$ and $(G',f')$ be $k$-lattices. Then $\mathrm{NP}(G,f) \subseteq \mathrm{NP}(G',f')$ if and
only if $(G,f) \le (G',f')$.

To provide evidence for the Embedding Conjecture we formulate in the next
section a theorem (Theorem 3) which shows that the Embedding Conjecture is
true for a large subclass of all $k$-lattices including all 2-lattices (Corollary 2) and
all $k$-chains (Corollary 1). Furthermore, the 3-lattices in Fig. 4 turn out to be
no counterexample.



6 Evidence for the Embedding Conjecture

We establish a theorem which shows that the Embedding Conjecture is true for a
large subclass of $k$-lattices. Proving this theorem, we detect some normal forms of
(hypothetical) inclusions between partition classes, which enables a generalization of
the easy-hard arguments developed by Kadin (cf. |B| ) to the context of partition
classes. A $k$-chain is called repetition-free iff neighboring elements have different
labels.

Theorem 3. Assume that the polynomial-time hierarchy does not collapse. Let
$(G,f)$ and $(G',f')$ be $k$-lattices. If $\mathrm{NP}(G,f) \subseteq \mathrm{NP}(G',f')$ then every
repetition-free $k$-subchain of $(G,f)$ occurs as a $k$-subchain of $(G',f')$.

Theorem 3 easily gives that the 3-lattices in Fig. 2 and Fig. 3 define incomparable
partition classes over NP, unless the polynomial-time hierarchy collapses.
The following corollary shows that the Embedding Conjecture is true for
$k$-chains.

Corollary 1. Assume the polynomial-time hierarchy does not collapse. For
$k$-chains $(G,f)$ and $(G',f')$ it holds that $\mathrm{NP}(G,f) \subseteq \mathrm{NP}(G',f')$ if and only if
$(G,f) \le (G',f')$.
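
The necessary condition of Theorem 3 is easy to test mechanically on small examples. The Python sketch below enumerates the label sequences of repetition-free subchains and compares the 3-lattice of Fig. 2 with the 3-chain of Fig. 3. It rests on two assumptions that are not spelled out in the text: a $k$-subchain is read as any chain $x_1 < x_2 < \dots$ of the poset together with its label sequence, and the order and labels of both lattices are reconstructed from Examples 2 and 3. All identifiers are hypothetical.

```python
def transitive_closure(covers):
    """Strict order as a set of pairs (x, y) with x < y."""
    lt = set(covers)
    while True:
        extra = {(x, z) for (x, y) in lt for (y2, z) in lt if y == y2} - lt
        if not extra:
            return lt
        lt |= extra


def repetition_free_chains(labels, covers):
    """Label sequences of all chains whose neighbouring labels differ."""
    lt = transitive_closure(covers)
    found = set()

    def extend(last, seq):
        found.add(seq)
        for y in labels:
            if (last, y) in lt and labels[y] != seq[-1]:
                extend(y, seq + (labels[y],))

    for x in labels:
        extend(x, (labels[x],))
    return found


# Reconstructed 3-lattice of Fig. 2 (assumption, see Example 2).
FIG2 = repetition_free_chains(
    {"a": 1, "0": 1, "1": 2, "c": 2, "b": 3, "d": 3},
    {("0", "c"), ("0", "d"), ("c", "b"), ("d", "a"),
     ("d", "b"), ("a", "1"), ("b", "1")})

# Reconstructed 3-chain of Fig. 3: a < b < c with labels 2, 3, 1.
FIG3 = repetition_free_chains({"a": 2, "b": 3, "c": 1},
                              {("a", "b"), ("b", "c")})
```

Under these assumptions each lattice has a repetition-free subchain that the other lacks, for instance the label sequence 1, 3, 1, 2 on one side and 2, 3, 1 on the other, which is the situation in which Theorem 3 yields incomparability.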

Furthermore, the Embedding Conjecture is generally true for 2-lattices. This
is a consequence of Theorem 3 and the following simple proposition.

Proposition 4. Every 2-lattice is equivalent to its longest chain with alternating 
labels 1 and 2. 

Corollary 2. Assume the polynomial-time hierarchy does not collapse. For
2-lattices $(G,f)$ and $(G',f')$ it holds that $\mathrm{NP}(G,f) \subseteq \mathrm{NP}(G',f')$ if and only if
$(G,f) \le (G',f')$.

Assume the polynomial-time hierarchy does not collapse. By Theorem 3, if
the $k$-lattice $(G,f)$ has a repetition-free $k$-subchain which is not a $k$-subchain
of the $k$-lattice $(G',f')$ then $\mathrm{NP}(G,f) \not\subseteq \mathrm{NP}(G',f')$. But what about $k$-lattices
which have the same repetition-free $k$-subchains? For example, take the 3-lattices
$(G,f)$ and $(G',f')$ presented in Fig. 4. Since $(G,f) \not\le (G',f')$ the Embedding
Conjecture says that $\mathrm{NP}(G,f) \not\subseteq \mathrm{NP}(G',f')$, but Theorem 3 does not help to
prove this. However, this can be proved by a clever exploitation of the situation
in order to simplify the self-reduction tree of the satisfiability problem. The proof
is inspired by a work of Hemaspaandra et al. |^.

Theorem 4. Let $(G,f)$ and $(G',f')$ be the 3-lattices shown in Fig. 4. Then it
holds that $\mathrm{NP}(G,f) \subseteq \mathrm{NP}(G',f')$ if and only if NP = coNP.



166 Sven Kosub and Klaus W. Wagner 



7 On the Structure of the Boolean Hierarchy of $k$-Partitions for $k \ge 3$

Assume the Embedding Conjecture is true. Then the structure of the boolean
hierarchy of $k$-partitions with respect to set inclusion is identical with the partial
order of $\equiv$-equivalence classes of $k$-lattices with respect to $\le$. To get an idea of
the complexity of the latter structure we will now present the partial order of
all equivalence classes of 3-lattices which include a boolean 3-lattice of the form
$(\{1,2\}^3, f)$ with surjective $f$ (for non-surjective $f$ these $k$-lattices do not really
define 3-partitions). The 5796 different boolean 3-lattices of the form $(\{1,2\}^3, f)$
with surjective $f$ are in 132 different equivalence classes.
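
The count of 5796 labelled boolean 3-lattices can be confirmed by direct enumeration, since it is the number of surjections from the eight elements of $\{1,2\}^3$ onto $\{1,2,3\}$. The short Python check below does only that. It does not attempt to compute the 132 equivalence classes, and the variable names are hypothetical.

```python
from itertools import product

# A labelling assigns one of the labels 1, 2, 3 to each of the 8 elements
# of {1,2}^3; it is surjective when all three labels occur.
labellings = [lab for lab in product((1, 2, 3), repeat=8)
              if set(lab) == {1, 2, 3}]
surjective_count = len(labellings)

# Labellings whose first entry, standing for f(1,1,1), equals 1.
bottom_label_one = sum(1 for lab in labellings if lab[0] == 1)
```

The same number follows from inclusion-exclusion as $3^8 - 3 \cdot 2^8 + 3$, and by symmetry of the labels exactly one third of these labellings assign the label 1 to the element $(1,1,1)$.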

Figure 5 shows the partial order of the 44 equivalence classes which contain
boolean 3-lattices of the form $(\{1,2\}^3, f)$ such that $f(1,1,1) = 1$. The cases
$f(1,1,1) = 2$ and $f(1,1,1) = 3$ yield isomorphic partial orders. A line from
equivalence class $\mathcal{G}$ up to equivalence class $\mathcal{G}'$ means that $(G,f) \le (G',f')$ for
every $(G,f) \in \mathcal{G}$ and $(G',f') \in \mathcal{G}'$. We emphasize that such a study would be
intractable without the possibility to represent boolean $k$-lattices by equivalent
$k$-lattices.




Figure 6 shows the right part of the partial order in Fig. 5 where each
equivalence class is represented by a minimal equivalent 3-lattice. The left part of the
partial order in Fig. 5 is symmetric to the right part where the labels 2 and 3
change their role.

Theorem 5. Assume the polynomial-time hierarchy does not collapse. If there
is a solid line from class $\mathcal{G}$ up to class $\mathcal{G}'$ in Figure 6 then $\mathrm{NP}(G,f) \subsetneq \mathrm{NP}(G',f')$
for every $(G,f) \in \mathcal{G}$ and $(G',f') \in \mathcal{G}'$.

Every "solid line" in this theorem is an application of Theorem 3 besides the
one marked by * which is just Theorem 4.

At the end of this section we mention that the partial order of equivalence
classes of 3-lattices does not have "bounded width".

Proposition 5. For every $m \in \mathbb{N}$ there exist at least $m$ 3-lattices that are
incomparable with respect to $\le$.

8 Conclusion 

In the preceding sections, we have investigated the boolean hierarchy of
$k$-partitions over NP for $k \ge 3$ as a generalization of the boolean hierarchy of
sets (i.e., 2-partitions) over NP. Whereas the structure of the latter hierarchy
is rather simple, the structure of the boolean hierarchy of $k$-partitions over NP
for $k \ge 3$ turned out to be much more complicated. We established the Embedding
Conjecture which enables us to get an overview of this structure. This
conjecture was supported by several partial results. A complete proof of or a
counterexample to the Embedding Conjecture for NP is left to find.

Finally, let us mention that partitions of classes $\mathrm{NP}(G,f)$ can be accepted in
a natural way by nondeterministic polynomial-time machines with a notion of
acceptance which depends on the $k$-lattice $(G,f)$. As a consequence one can show
that all these classes have complete partitions with respect to an appropriate
$\le^{p}_{m}$-reduction.

Abstract. Goodman, Greenberg, Madras and March gave a lower bound
of $n^{-\Omega(\log n)}$ for the maximum arrival rate for which the $n$-user binary
exponential backoff protocol is stable. Thus, they showed that the protocol
is stable as long as the arrival rate is at most $n^{-\Omega(\log n)}$. We improve
the lower bound, showing that the protocol is stable for arrival rates up
to $O(n^{-.9})$.

1 Introduction 

A multiple-access channel is a broadcast channel that allows multiple users to 
communicate with each other by sending messages onto the channel. If two or 
more users simultaneously send messages, then the messages interfere with each 
other (collide), and the messages are not transmitted successfully. The channel is 
not centrally controlled. Instead, the users use a contention-resolution protocol to 
resolve collisions. Thus, after a collision, each user involved in the collision waits a 
random amount of time (which is determined by the protocol) before re-sending. 
Perhaps the best-known contention-resolution protocol is the Ethernet protocol 
of Metcalfe and Boggs P). The Ethernet protocol is based on the following simple 
binary exponential backoff protocol. Time is divided into discrete units called 
steps. If the $i$'th user has a message to send during a given step, then it sends
this message with probability $2^{-b_i}$ where $b_i$ denotes the number of collisions that
this message has already had. With probability $1 - 2^{-b_i}$ user $i$ does not send
during the step. The Ethernet protocol is based on binary exponential backoff, 
but some modifications are made to make it easier to build. See [bl9j for details. 

Hastad, Leighton and Rogoff jSI have studied the performance of the binary 
exponential backoff protocol in the following natural model. The system consists 
of $n$ users. Each user maintains a queue of messages that it wishes to send. At
the beginning of the $t$'th time step, the length of the queue of the $i$'th user
is denoted $q_i(t)$ and the number of times that the message at the head of its
queue has collided is denoted $b_i(t)$. At the beginning of the $t$'th step, each queue
receives 0 or 1 new messages. In particular, a new message is added to the end
of each queue independently with probability $\lambda/n$, where $\lambda$ is the arrival rate of
the system. After the new messages are added to the queues, each user makes
an independent decision about whether or not to send the message at the head
of its queue, using the binary exponential backoff protocol. (If the message at
the head of the $i$'th queue has never been sent before then $b_i = 0$, so it is now
sent. Otherwise, $b_i = b_i(t)$, so it is sent independently with probability $2^{-b_i(t)}$.)
If exactly one message is sent (so there are no collisions), then this message is
delivered successfully, and it leaves its queue. Otherwise, the messages that are
sent collide and no messages are delivered successfully.
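
The model just described is simple enough to simulate directly. The Python sketch below is a minimal illustration of one reading of it, not code from the paper: the function name `simulate` and its parameters are hypothetical, arrivals are drawn before the sending decisions of the same step, and a user's counter is reset to 0 after a successful send.

```python
import random


def simulate(n, lam, steps, seed=0):
    """Run the n-user binary exponential backoff model for `steps` steps.

    Returns (arrivals, delivered, queue lengths, backoff counters).
    """
    rng = random.Random(seed)
    queue = [0] * n
    backoff = [0] * n
    arrivals = delivered = 0
    for _ in range(steps):
        # Each queue receives a new message independently with probability lam/n.
        for i in range(n):
            if rng.random() < lam / n:
                queue[i] += 1
                arrivals += 1
        # Each user with a non-empty queue sends with probability 2^(-b_i).
        senders = [i for i in range(n)
                   if queue[i] > 0 and rng.random() < 2.0 ** -backoff[i]]
        if len(senders) == 1:
            # Exactly one sender: the message is delivered, counter resets.
            i = senders[0]
            queue[i] -= 1
            backoff[i] = 0
            delivered += 1
        else:
            # Collision: every sender increases its counter.
            for i in senders:
                backoff[i] += 1
    return arrivals, delivered, queue, backoff


arrivals, delivered, queue, backoff = simulate(n=8, lam=0.2, steps=2000, seed=1)
```

Such a simulation can only suggest whether queues stay short for a given arrival rate. It says nothing rigorous about positive recurrence, which is what the theorems below address.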

Since the arrivals are modelled by a stochastic process, the evolution of the 
whole system over time can be viewed as a Markov chain in which the state just 
before step $t$ is $X(t) = ((q_1(t),\dots,q_n(t)), (b_1(t),\dots,b_n(t)))$ and the next state
is $X(t+1)$. One measure of the performance of the system is the expectation of
the random variable $T_{ret}$, which is the number of steps required for the system to
return to the start state $X(0) = ((0,\dots,0),(0,\dots,0))$. Håstad et al. |S| proved
that if the arrival rate is too high, then the system is unstable, in the sense that 
the expected recurrence time is infinite. 

Theorem 1 (Håstad, Leighton, and Rogoff). Suppose that for some positive
$\epsilon$, $\lambda > \frac{1}{2} + \epsilon$. Suppose that $n$ is sufficiently large (as a function of $\epsilon$). Then
$E[T_{ret}] = \infty$.

On the other hand, Goodman, Greenberg, Madras and March 0 showed that 
if the arrival rate is sufficiently low, then the system is stable. 

Theorem 2 (Goodman, Greenberg, Madras and March). There is a positive
constant $\alpha$ such that $E[T_{ret}]$ is finite for the $n$-user system, provided that
$\lambda < 1/n^{\alpha \log n}$.

While Goodman, Greenberg, Madras, and March's result is the only known
stability result for the finitely-many-users binary-exponential-backoff protocol,
their upper bound ($\lambda < 1/n^{\alpha \log n}$) is very small. In this paper, we narrow the gap
between the two results. In particular, we prove the following theorem.

Theorem 3. There is a positive constant $\alpha$ such that, as long as $n$ is sufficiently
large and $\lambda < \frac{1}{\alpha n^{.9}}$, then $E[T_{ret}]$ is finite for the $n$-user system.

The point of Theorem 3 is to show that $n$-user Binary Exponential Backoff is
stable for arrival rates which grow faster asymptotically than $1/n$. That is, the
purpose of the result is to show that, for positive constants $\alpha$ and $\eta$,
$\lambda \le \frac{1}{\alpha n^{1-\eta}}$ guarantees stability. We have chosen $\eta = .1$ for concreteness. We
believe that the same methods could be used for slightly larger values of $\eta$, but an
interesting (and difficult) question raised by this work is whether the same result
would be true for $\eta = 1$. That is, is there a constant $\alpha$ such that the $n$-user system is
stable whenever $\lambda \le \frac{1}{\alpha}$?

The organisation of the paper is as follows. In Section 2 we summarise other
related work. In Section 3 we give the proof of Theorem 3.

2 Related Work

We now summarize some other related work. We start by observing that the
results in Theorems 1 and 2 can be extended to more general models. For example,
the result of Goodman et al. can be extended to a more general model of
stochastic arrivals in which the expected number of arrivals at user $i$ at time $t$
(conditioned on all events up to time $t$) is a quantity, $\lambda_i$, and $\sum_i \lambda_i$ is required
to be equal to $\lambda$. The result of Håstad et al. can be extended to small values
of $n$, provided that $\lambda > .568 + 1/(4n-2)$. The instability result of Håstad et al.
implies that, when $\lambda$ is sufficiently large, the expected average waiting time of
messages is infinite.

Next, we mention that the binary exponential backoff protocol is known
to be unstable in the infinitely-many-users Poisson-arrivals model. Kelly and
MacPhee mi showed this for $\lambda > \ln 2$ and Aldous P showed that it holds for
all positive $\lambda$.¹

Finally, we mention that, while the goal of this paper is to understand the
binary-exponential backoff protocol, on which Ethernet is based, there are other
acknowledgement-based protocols which are known to be stable in the same
model for larger arrival rates. In particular, Håstad et al. have shown that
polynomial-backoff protocols are stable as long as $\lambda < 1$. The expected waiting
time of messages is high in polynomial-backoff protocols, but Raghavan and
Upfal fD| have given a protocol that is stable for $\lambda < 1/10$, in which the
expected waiting time of every message is $O(\log n)$, provided that the users are
given a reasonably good estimate of $\log n$. Finally, Goldberg, MacKenzie, Paterson
and Srinivasan ^ have given a protocol that is stable for $\lambda < 1/e$, in which
the expected average message waiting-time is $O(1)$, provided that the users are
given an upper bound on $n$.

We conclude by observing that the technique of Goldberg and MacKenzie 0
can be used to extend Theorem 3 so that it applies to a non-geometric version of
binary-exponential backoff, which is closer to the version used in the Ethernet.
(Instead of deciding whether to send on each step independently with probability
$2^{-b_i}$, the user simply chooses the number of steps to wait before sending
uniformly at random from $[1,\dots,2^{b_i}]$.) The ideas are the same as those used in
the proof that follows, but the details are messier. Our result can also be ex- 
tended along the lines of to show that, when A is sufficiently low, the expected 
average message waiting time is finite. 
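
The difference between the geometric and the non-geometric sending rule can be stated in a few lines of code. The Python sketch below is an illustration under the reading given in the parenthetical remark above. The function names are hypothetical, and the waiting time is counted so that a wait of 1 means sending on the very next opportunity.

```python
import random


def geometric_wait(b, rng):
    """Steps until the first send when each step sends independently w.p. 2^(-b)."""
    steps = 1
    while rng.random() >= 2.0 ** -b:
        steps += 1
    return steps


def uniform_wait(b, rng):
    """Non-geometric variant: wait chosen uniformly at random from 1..2^b."""
    return rng.randint(1, 2 ** b)


rng = random.Random(0)
geometric_samples = [geometric_wait(3, rng) for _ in range(1000)]
uniform_samples = [uniform_wait(3, rng) for _ in range(1000)]
```

The geometric rule has unbounded waiting times with mean $2^{b}$, while the uniform rule never waits longer than $2^{b}$ steps, which is one way to see why the two analyses differ in their details.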



¹ Note that it can be misleading to view the infinitely-many-users model as the limit
(as $n$ tends to infinity) of the $n$-users model. For example, the "polynomial backoff"
protocol is known to be unstable (for any positive $\lambda$) in the infinitely-many-users
Poisson-arrivals model □HI, but it is stable (for any $\lambda < 1$) in the $n$-users model 0.
Thus, Aldous's result does not rule out the possibility that there is a positive
constant $\lambda^*$ such that the $n$-user binary exponential backoff protocol is stable whenever
$\lambda < \lambda^*$.

3 The Stability Proof

In order to prove Theorem 3, let $\lambda = \frac{1}{\alpha' n^{.9}}$, where $\alpha' \ge \alpha$. We will now define the
relevant potential function. Let $f(X(t))$ be the following function of the state
just before step $t$.

$$f(X(t)) = \alpha n^{1.8} \sum_{i=1}^{n} q_i(t) + \sum_{i=1}^{n} 2^{b_i(t)}.$$



We will use the following generalisation of Foster's theorem [2]. Note that the
Markov chain $X$ satisfies the initial conditions of the theorem. That is, it is
time-homogeneous, irreducible, and aperiodic and has a countable state space.

Theorem 4 (Foster; Fayolle, Malyshev, Menshikov). A time-homogeneous
irreducible aperiodic Markov chain $X$ with a countable state space $\mathcal{A}$ is positive
recurrent iff there exists a positive function $f(\rho)$, $\rho \in \mathcal{A}$, a number $\epsilon > 0$, a
positive integer-valued function $k(\rho)$, $\rho \in \mathcal{A}$, and a finite set $C \subseteq \mathcal{A}$, such that
the following inequalities hold.

$$E[f(X(t + k(X(t)))) - f(X(t)) \mid X(t) = \rho] \le -\epsilon k(\rho), \quad \rho \notin C \tag{1}$$

$$E[f(X(t + k(X(t)))) \mid X(t) = \rho] < \infty, \quad \rho \in C. \tag{2}$$

We use the following notation, where $\beta = 3$. For a state $X(t)$, let $m(X(t))$
denote the number of users $i$ with $q_i(t) > 0$ and $b_i(t) < \lg\beta + \lg n$, and let $m'(X(t))$
denote the number of users $i$ with $q_i(t) > 0$ and $b_i(t) < .8\lg n + 1$. Note that
$m'(X(t)) \le m(X(t))$. We will take $\epsilon$ to be $1 - 2/\alpha$ and $C$ to be the set consisting
of the single state $((0,\dots,0),(0,\dots,0))$. We define $k(((0,\dots,0),(0,\dots,0))) = 1$,
so Equation 2 is satisfied. For every state $\rho \notin C$, we will define $k(\rho)$ in such a
way that Equation 1 is also satisfied. We give the details in three cases.

3.1 Case 1: $m'(X(t)) = 0$ and $m(X(t)) < n^{.8}$.

For every state $\rho$ such that $m'(\rho) = 0$ and $m(\rho) < n^{.8}$ we define $k(\rho) = 1$. We
wish to show that, if $\rho \ne ((0,\dots,0),(0,\dots,0))$ and $X(t) = \rho$, then
$E[f(X(t+1)) - f(X(t))] \le -\epsilon$. Our general approach is the same as the approach used
in the proof of Lemma 5.7 of |0|. For convenience, we use $m$ as shorthand for
$m(X(t))$ and we use $\ell$ to denote the number of users $i$ with $q_i(t) > 0$. Without
loss of generality, we assume that these are users $1,\dots,\ell$. We use $p_i$ to denote
the probability that user $i$ sends on step $t$. (So $p_i = 2^{-b_i(t)}$ if $i \in [1,\dots,\ell]$
and $p_i = \lambda/n$ otherwise.) We let $T$ denote $\prod_{i=1}^{n} (1 - p_i)$ and we let $S$ denote
$\sum_{i=1}^{n} \frac{p_i}{1-p_i}$. Note that the expected number of successes at step $t$ is $ST$. Let $I_{a,i,t}$
be the 0/1 indicator random variable which is 1 iff there is an arrival at user $i$
during step $t$ and let $I_{s,i,t}$ be the 0/1 indicator random variable which is 1 iff
user $i$ succeeds in sending a message at step $t$. Then



Binary Exponential Backoff Is Stable for High Arrival Rates 






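\begin{align*}
E[f(X(t+1)) - f(X(t))]
 &= \alpha n^{1.8} \sum_{i=1}^{n} \bigl(E[I_{a,i,t}] - E[I_{s,i,t}]\bigr) + \sum_{i=1}^{n} \bigl(E[2^{b_i(t+1)}] - 2^{b_i(t)}\bigr) \\
 &= \alpha n^{1.8} (\lambda - ST) + \sum_{i=1}^{n} \bigl(2^{b_i(t)} \sigma_i - (2^{b_i(t)} - 1)\pi_i\bigr) \tag{3} \\
 &= \alpha n^{1.8} (\lambda - ST) + \sum_{i=1}^{n} \Bigl(2^{b_i(t)} p_i \Bigl(1 - \frac{T}{1-p_i}\Bigr) - (2^{b_i(t)} - 1)\frac{p_i T}{1-p_i}\Bigr) \\
 &= \alpha n^{1.8} (\lambda - ST) + \sum_{i=1}^{\ell} \Bigl(1 - \frac{T}{1-p_i}\Bigr) + \sum_{i=\ell+1}^{n} \frac{\lambda}{n}\Bigl(1 - \frac{T}{1-p_i}\Bigr) - \ell T \\
 &= \alpha n^{1.8} (\lambda - ST) + \ell - \ell T + \frac{(n-\ell)\lambda}{n} - T\Bigl(\sum_{i=1}^{\ell} \frac{1}{1-p_i} + \sum_{i=\ell+1}^{n} \frac{p_i}{1-p_i}\Bigr) \\
 &= \alpha n^{1.8} (\lambda - ST) + \ell - \ell T + \frac{(n-\ell)\lambda}{n} - ST - \ell T \\
 &= \alpha n^{1.8} \lambda + \ell + \frac{(n-\ell)\lambda}{n} - T\bigl((\alpha n^{1.8} + 1) S + 2\ell\bigr), \tag{4}
\end{align*}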

where $\sigma_i$ in Equality 3 denotes the probability that user $i$ collides at step $t$
and $\pi_i$ denotes the probability that user $i$ sends successfully at step $t$. (To see
why Equality 3 holds, note that with probability $\sigma_i$, $b_i(t+1) = b_i(t) + 1$, with
probability $\pi_i$, $b_i(t+1) = 0$, and otherwise, $b_i(t+1) = b_i(t)$.) We now find lower
bounds for $S$ and $T$. First,

$$S = \sum_{i=1}^{n} \frac{p_i}{1-p_i} = \sum_{i=1}^{\ell} \frac{2^{-b_i(t)}}{1 - 2^{-b_i(t)}} + \frac{\lambda(n-\ell)}{n-\lambda} \ge \frac{m}{\beta n - 1} + \frac{\lambda(n-\ell)}{n-\lambda}. \tag{5}$$

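Next,

$$T = \prod_{i=1}^{n} (1 - p_i) \ge \Bigl(1 - \frac{1}{2n^{.8}}\Bigr)^{m} \Bigl(1 - \frac{1}{\beta n}\Bigr)^{\ell - m} \Bigl(1 - \frac{\lambda}{n}\Bigr)^{n-\ell} \ge 1 - \frac{m}{2n^{.8}} - \frac{\ell - m}{\beta n} - \frac{\lambda(n-\ell)}{n}. \tag{6}$$

Combining Equations 4, 5 and 6 we get the following equation.

$$E[f(X(t+1)) - f(X(t))] \le \alpha n^{1.8}\lambda + \ell + \frac{(n-\ell)\lambda}{n} - \Bigl(1 - \frac{m}{2n^{.8}} - \frac{\ell - m}{\beta n} - \frac{\lambda(n-\ell)}{n}\Bigr) \Bigl((\alpha n^{1.8} + 1)\Bigl(\frac{m}{\beta n - 1} + \frac{\lambda(n-\ell)}{n-\lambda}\Bigr) + 2\ell\Bigr). \tag{7}$$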
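
The two facts about $S$ and $T$ used in this derivation can be checked numerically for any concrete vector of sending probabilities. The Python sketch below does so. It is an illustration only, the name `success_quantities` is hypothetical, and the sample probabilities are arbitrary values below 1.

```python
import math


def success_quantities(p):
    """Return S, T and the per-user success probabilities pi_i.

    T is the probability that nobody sends, S is the sum of p_i / (1 - p_i),
    and pi_i is the probability that user i sends while all others stay silent.
    """
    T = math.prod(1.0 - x for x in p)
    S = sum(x / (1.0 - x) for x in p)
    pi = [x * math.prod(1.0 - y for j, y in enumerate(p) if j != i)
          for i, x in enumerate(p)]
    return S, T, pi


# Two backlogged users with counters 3 and 5, two idle users with lambda/n = 0.01.
p = [2.0 ** -3, 2.0 ** -5, 0.01, 0.01]
S, T, pi = success_quantities(p)
```

The sum of the $\pi_i$ equals $ST$, which is the expected number of successes in a step, and $T$ is at least $1 - \sum_i p_i$, which is the elementary product bound behind the last step of (6).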



We will let $g(m,\ell)$ be the quantity in Equation 7 plus $\epsilon$ and we will show that
$g(m,\ell)$ is negative for all values of $0 \le m < n^{.8}$ and all $\ell \ge m$. In particular, for
every fixed positive value of $m$, we will show that

1. $g(m,m)$ is negative,

2. $g(m,n)$ is negative, and

3. $\frac{\partial^2}{\partial \ell^2} g(m,\ell) > 0$. ($g(m,\ell)$ is concave up as a function of $\ell$ for the fixed value
of $m$ so $g(m,\ell)$ is negative for all $\ell \in [m,n]$.)

We will handle the case $m = 0$ similarly except that $m = \ell = 0$ corresponds to
the start state, so we will replace Item 1 with the following for $m = 0$.

1'. $g(0,1)$ is negative.

The details of the proof are now merely calculations.


1. g{m,m) is negative: ^(to, to) x 2a'n^ ®(/3n — — 1) is equal to the 

following. 

— 2to — 4m^ + 2n + 6mn + 2/3TOn+6/3m^n+2a^m^n^’^ + 2aTn^’® + 2Q^m^n^'® 

— 2n^ — 2(3n^ —SPmn^ —a rnn^'^ —3a +2ari^'^ + 2amn^'^ + 2al3m^n^'^ 

— loL pe n —la mn +2a pmn —Aa pm n -\-2pn —a m n -\-a pmn 

— laa m n —Ian —lapn —la en —4 apmn —aa mn -}-4a pmn 

— aa /3rn^n^'^ +2,a'^ j3rn^ + 2aa ^-^aa' jdrnn^''^ + 2a/3'nf‘'^ +2a"^ j3en^'^ 

— 2a'^ f3mn'^'^ +aa'^m^n‘^'^ +aa' (3mn^'^ — 2aa'^mn^'^ 



The dominant term is —2aa'^mn^'^. Note that there is a positive term 
(aa'^TO^n^ ®) which could be half this big if to is as big as n'® (the upper 
bound for Case 1), but all other terms are asymptotically smaller. 

2. g{m, n) is negative: g{m, n) x 2a' j3n{[3n — 1) is equal to the following. 



— 2a + a' !3m? n'^ — 2a' Pen + 6a' mn — 2a' Pmn — 2a' Pmn^'^ 

— 2aa' m?n^'^ — 2aPn^'^ — 4a'n^ + 2a' Pn^ + 2a' P^en'^ 

— 3a Pmn^ + aa' Pm?n? + 2a' P^mn'^'^ + 2aa mn^'^ — 2aa' Pmn^'^ + 2aP^n^'^ 
+ 4a' Pn^ — 2a' P^n^ 

Since P > 2, the term — 2a'pn® dominates +4a'/3n®. For the same reason, 
the term— 2aQ;'/3TOn^ ® dominates the two terms+2aa'TOn^ ® &nd+aa' PmPrP . 
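The other terms are asymptotically smaller.

3. $\frac{\partial^2}{\partial \ell^2} g(m,\ell) > 0$:

$$\frac{\partial^2}{\partial \ell^2} g(m,\ell) = 2 \Bigl(\frac{1}{\beta n} - \frac{\lambda}{n}\Bigr) \Bigl(2 - \frac{(\alpha n^{1.8} + 1)\lambda}{n - \lambda}\Bigr).$$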

1’. y(0, 1) is negative: ^(0, 1) x a' {a' rr'^ — 1) is equal to the following. 



+ 4/3 — 3a'n'® — 5/3n + a/3n^'® + — a' — a' PerP'^ 

+ PrP — aa'rP''^ + 2a'^rP'^ — 3aPn^'^ + 2a' PrP'^ + aa'tP'^ 

+ aa' PrP'"^ + aPrP'^ — a'^Pn^'^ + a"^ PerP'^ 

Since a'(l — e) > a(l — e) > 1, the term — a'^/3(l — e)n^ ® dominates the term 
+a/3n^ ®. The other terms are asymptotically smaller. 



3.2 Case 2: m{X{t)) > n'* or m'{X{i)) > n'^. 

For every state p such that rn{p) > n ® or m'{p) > we will define an integer k 
(which depends upon p) and we will show that, if X{t) = p, then E\f{X{t + 
k) — f{X{t))] < —ek, where e = 1 — 2/a. 

For convenience, we will use m as shorthand for m{X(t)) and m! as shorthand 
for m'{X{t)). If m > n ® then we will define r = m, W = m^/^|"lgr]2“®, A = W, 
b = lg P+lg n and v = n. Otherwise, we will define r = m' ,W = |"lg r] 2“®, A = 0, 
$b = .8\lg n + 1$, and $v = 2\lceil n^{.8} \rceil$. In either case, we will define $k = 4(r+v)\lceil \lg r \rceil$. Let
$\tau$ be the set of steps $\{t,\dots,t+k-1\}$ and let $S$ be the random variable which
denotes the number of successes that the system has during $\tau$. Let $p$ denote
$\Pr(S \ge W)$. Then we have



n t-\-k 

E[f{X{t + k)-f{X{t))]<arp-^\k-an^-^E[S] + Y^ ^ 

< arP'^Xk — arP'^Wp + kn 

< —ek, 

where the final inequality holds as long as ap> 2^® and n is sufficiently big (see 
the Appendix). Thus, it suffices to find a positive lower bound for p which is 
independent of n. We do this with plenty to spare. In particular, we show that 
$p \ge 1 - 5 \times 10^{-5}$.

We start with a technical lemma, which describes the behaviour of a single
user.

Lemma 1. Let $j$ be a positive integer, and let $\delta$ be a positive integer which is
at least 2. Suppose that $q_i(t) > 0$. Then, with probability at least $1 - \lceil \lg j \rceil\, j^{-\delta/(2\ln 2)}$,
either user $i$ succeeds in steps $[t,\dots,t + \delta j \lceil \lg j \rceil - 1]$, or $b_i(t + \delta j \lceil \lg j \rceil) \ge \lceil \lg j \rceil$.

Proof. Suppose that user $i$ is running in an externally-jammed channel (so every
send results in a collision). Let $X_z$ denote the number of steps $t' \in [t,\dots,t +
\lceil \delta j \lg(j) \rceil]$ with $b_i(t') = z$. We claim that $\Pr(X_z > \delta \lceil \lg j \rceil 2^{z-1}) < j^{-\delta/(2\ln 2)}$.
This proves the lemma since $\sum_{z=0}^{\lceil \lg j \rceil - 1} \delta \lceil \lg j \rceil 2^{z-1} \le \delta j \lceil \lg j \rceil$. To prove the claim,
note that $X_0 \le 1$, so $\Pr(X_0 > \delta \lceil \lg j \rceil / 2) = 0 < j^{-\delta/(2\ln 2)}$. For $z > 0$, note that

$$\Pr(X_z > \delta \lceil \lg j \rceil 2^{z-1}) \le (1 - 2^{-z})^{\delta \lceil \lg j \rceil 2^{z-1}} < j^{-\delta/(2\ln 2)}.$$
and, if they do occur, then S is likely to be at least W. This will allow us to 
conclude that p > 1 — 5 x 10“®, which will finish Case 2. We start by defining 
B = \W~\ + [A] , A:' = 4r [Ig r] , k" = 4B\lgB~\ and tq = {t, . . . ,t + k' — 1}. Let 
t'(z) be the set of all t' G t such that bi{t') = 0 and either (1) qi{t') > 0 or (2) 
there is an arrival at user i at t' . Let T 2 be the set of all t' G t such that \{{t" , i) \ 
t" G T'{i) and t" <t'}\> B. Finally, let n be the set of all t' Gt — tq — T 2 such 
that, for some i, r'fi) n \t' — k” + 1, t'\ ^ 0. We can now define the events Ijn-fiEI 

El. There are at most A arrivals during r. 

E2. Every station with qi{f) > 0 and bi{t) < b either sends successfully during tq 
or has bi{t + k') > [Ig r] . 

E3. Every station with qi{t) > 0 and bi{t) < b has bi{t') < 6 -I- lg(r)/2 -|- 3 for all 
t' G T. 

E4. For all t' G r'(i) and all t" > t' such that t" G t — ti — T 2 , bi{t') > [IgS] . 

Next, we show that EH1-E0 are likely to occur. 

Lemma 2. If $n$ is sufficiently large then $\Pr(\overline{E1}) \le 10^{-5}$.

Proof. The expected number of arrivals in $\tau$ is $\lambda k$. If $m \ge n^{.8}$, then
$A = W \ge 2\lambda k$. By a Chernoff bound, the probability that there are this
many arrivals is at most $10^{-5}$. Otherwise, $A = 0$ and $\lambda k = o(1)$. Thus,

$$\Pr(E1) \ge (1 - \lambda/n)^{nk} \ge 1 - \lambda k \ge 1 - 10^{-5}.$$

Lemma 3. If $n$ is sufficiently large then $\Pr(\overline{E2}) \le 10^{-5}$.

Proof. Apply Lemma 1 to each of the $r$ users with $\delta = 4$ and $j = r$. Then

$$\Pr(\overline{E2}) \le r \lceil \lg r \rceil\, r^{-4/(2\ln 2)} \le 10^{-5}.$$



Lemma 4. If $n$ is sufficiently large then $\Pr(\overline{E3}) \le 10^{-5}$.

Proof. Let y = Jill . Note that 2y < -|- 3. Suppose that user i has bi(f') > 

b + 2y. Then this user sent when its backoff counter was \b + z~\ for all z G 
{y, ... ,2y — 1}. The probability of such a send on any particular step is at most 
2 ^. Thus, the probability that it makes all y of the sends is at most 




ke \ 
2by2y ) 



V 



< 10 "®/^- 



Thus, the probability that any of the r users obtains such a big backoff counter 
is at most 10“®. 
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Lemma 5. If n is sufficiently large then Pr(iQ < 10 

Proof. We can apply Lemma E separately to each of the (up to B) pairs (t',i) 
with 6 = 4 and j = B. The probability that there is a failure is at most ^ 

10 - 5 . 



We now wish to show that $\Pr(S < W \mid E1 \wedge E2 \wedge E3 \wedge E4) \le 10^{-5}$. We begin
with the following lemma.

Lemma 6. Given any fixed sequence of states $X(t), \ldots, X(t + z)$ which does
not violate E2 or E4, and satisfies $t + z \in \tau - \tau_0 - \tau_1 - \tau_2$, $q_i(t + z) > 0$, and
$b_i(t + z) < b + \lg(r)/2 + 3$, the probability that user $i$ succeeds at step $t + z$ is at
least $\frac{1}{2^{10}\, 2^{b}\, r^{1/2}}$.

Proof. The conditions in the lemma imply the following. 

– There are no users $j$ with $b_j(t + z) < \lceil \lg B \rceil$.

– There are at most $B$ users $j$ with $b_j(t + z) < \lceil \lg r \rceil$.

– There are at most $r + B$ users $j$ with $b_j(t + z) < b$.

– There are at most $m + B$ users $j$ with $b_j(t + z) < \lg \beta + \lg n$.



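Thus, the probability that user $i$ succeeds is at least $2^{-(b+\lg(r)/2+3)}$ times the
probability that none of the other users sends at step $t + z$. Bounding the latter
with the four conditions above gives a success probability of at least

$$\frac{1}{2^{10}\, 2^{b}\, r^{1/2}}.$$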

Corollary 1. Given any fixed sequence of states $X(t), \ldots, X(t + z)$ which does
not violate E2, E3, or E4, and satisfies $t + z \in \tau - \tau_0 - \tau_1 - \tau_2$, the probability
that some user succeeds at step t + z is at least ^ 

Proof. Since $t + z \notin \tau_2$, at least $r - B$ of the users $i$ with $q_i(t) > 0$ and $b_i(t) < b$
have not succeeded before step $t + z$. Since E3 holds, each of these has $b_i(t + z) <
b + \lg(r)/2 + 3$. For all distinct $i$ and $i'$, the event that user $i$ succeeds at step $t + z$ is
disjoint with the event that user $i'$ succeeds at step $t + z$.



Lemma 7. If $n$ is sufficiently large then $\Pr(S < W \mid E1 \wedge E2 \wedge E3 \wedge E4) \le 10^{-5}$.

Proof. If El is satisfied then T2 does not start until there have been at least W 
successes. Since \ > k—k'—Bk" > u[lgr]/2. Corollary^] shows that the 

probability of having fewer than W successes is at most the probability of having 
fewer than W successes in u|"lgr]/2 Bernoulli trials with success probability 
2i/„.e . Since W is at most half of the expected number of successes, a Chernoff 
bound shows that the probability of having fewer than W successes is at most 
exp(-^) < 10-5. 

We conclude Case 2 by observing that $p$ is at least $1 - \Pr(\overline{E1}) - \Pr(\overline{E2}) -
\Pr(\overline{E3}) - \Pr(\overline{E4}) - \Pr(S < W \mid E1 \wedge E2 \wedge E3 \wedge E4)$. By Lemmas 2, 3, 4, 5 and
7 this is at least $1 - 5 \times 10^{-5}$.
3.3 Case 3: $0 < m'(X(t)) < n^{.4}$ and $m(X(t)) < n^{.8}$.

For every state $\rho$ such that $0 < m'(\rho) < n^{.4}$ and $m(\rho) < n^{.8}$, we will define
$k = 32 m'(\rho)\lceil \lg m'(\rho) \rceil + \lceil n^{.8} \rceil$. We will show that, if $X(t) = \rho$, then
$E[f(X(t + k)) - f(X(t))] \le -\epsilon k$. Once again, we will use $m$ as shorthand for $m(X(t))$
and $m'$ as shorthand for $m'(X(t))$. Let $\tau = \{t, \ldots, t + k - 1\}$, and let $S$ be the
number of successes that the system has in $\tau$. Let $p$ denote $\Pr(S \ge 1)$. As in
Case 2, $E[f(X(t + k)) - f(X(t))] \le \alpha n^{1.8} \lambda k - \alpha n^{1.8} p + kn$, and this is at most $-\epsilon k$
as long as $\alpha p \ge 9$. Thus, we will finish by finding a positive lower bound for $p$
which is independent of $n$.

Since $m' > 0$, there is a user $\gamma$ such that $b_\gamma(t) < .8 \lg n + 1$. Let $k' =
32 m' \lceil \lg m' \rceil$ and $\tau_0 = \{t, \ldots, t + k' - 1\}$. We will now define some events, as in
Case 2.

E1. There are no arrivals during $\tau$.

E2. Every station with $q_i(t) > 0$ and $b_i(t) < .8 \lg n + 1$ either sends successfully
during $\tau_0$ or has $b_i(t + k') \ge \lceil \lg m' \rceil$.

E3. $b_\gamma(t') < .8 \lg n + 7$ for all $t' \in \tau$.



Lemma 8. If $n$ is sufficiently large then $\Pr(\overline{E1}) \le 10^{-5}$.

Proof. As in the proof of Lemma 2,

$$\Pr(E1) \ge \left(1 - \frac{\lambda}{n}\right)^{nk} \ge 1 - \lambda k \ge 1 - 10^{-5}.$$



Lemma 9. $\Pr(\overline{E2}) \le 10^{-5}$.

Proof. We use lemma 5 with $\delta = 32$ and $j = m'$ to get

$$\Pr(\overline{E2}) \le m' \, \frac{\lceil \lg m' \rceil}{(m')^{16/\ln(2)}} \le 10^{-5}. \qquad (8)$$



Lemma 10. If $n$ is sufficiently large then $\Pr(\overline{E3}) \le 10^{-5}$.

Proof. Let $y = 6$, and suppose that user $\gamma$ sends with backoff $b_\gamma = \lceil .8 \lg n + r \rceil$
for $r \in \{1, \ldots, 6\}$. The probability of this happening is



Pr(E3) < 

^ / r—1 



< 




r 



^ / 2en-® \ 
- \6n-^2^ ) 
< 10 "®. 



6 
Lemma 11. Given any fixed sequence of states $X(t), \ldots, X(t + z)$ which does
not violate E1, E2, or E3 such that $t + z \in \tau - \tau_0$ and there are no successes
during steps $[t, \ldots, t + z - 1]$, the probability that user $\gamma$ succeeds at step $t + z$ is
at least $\frac{1}{2^{12} n^{.8}}$.



Proof. The conditions in the statement of the lemma imply the following. 



– $q_\gamma(t + z) > 0$ and $b_\gamma(t + z) < .8 \lg n + 7$.

– There are no users $j$ with $b_j(t + z) < \lceil \lg m' \rceil$.

– There are at most $m'$ users $j$ with $b_j(t + z) < .8 \lg n + 1$.

– There are at most $m$ users $j$ with $b_j(t + z) < \lg \beta + \lg n$.

– There will be no arrivals on step $t + z$.

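The probability of success for user $\gamma$ is at least $2^{-(.8 \lg n + 7)}$ times the probability
that no other user sends at step $t + z$. Bounding the latter with the conditions
above yields a success probability of at least $\frac{1}{2^{12} n^{.8}}$.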



Lemma 12. If $n$ is sufficiently large then $\Pr(S < 1 \mid E1 \wedge E2 \wedge E3) \le e^{-1/2^{12}}$.

Proof. Lemma 11 implies that the probability of having no successes is at most
the probability of having no successes in $|\tau - \tau_0|$ Bernoulli trials, each with
success probability $\frac{1}{2^{12} n^{.8}}$. Since $|\tau - \tau_0| \ge n^{.8}$, this probability is at most

$$\left(1 - \frac{1}{2^{12} n^{.8}}\right)^{n^{.8}} \le e^{-1/2^{12}}.$$

We conclude Case 3 by observing that $p$ is at least $1 - \Pr(\overline{E1}) - \Pr(\overline{E2}) -
\Pr(\overline{E3}) - \Pr(S < 1 \mid E1 \wedge E2 \wedge E3)$. By Lemmas 8, 9, 10 and 12 this is at least
$1 - 3 \times 10^{-5} - e^{-1/2^{12}} > .0002$.
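The final numerical claim of Case 3 is easy to check by machine. The snippet below is only an arithmetic sanity check, assuming the three failure events each have probability at most $10^{-5}$ and the conditional no-success probability is at most $e^{-1/2^{12}}$; the function name is illustrative and not part of the paper.

```python
import math


def case3_success_lower_bound() -> float:
    """Union-bound lower bound on p = Pr(S >= 1) in Case 3.

    Assumes three failure events of probability at most 1e-5 each and a
    conditional probability of no success of at most exp(-1 / 2**12).
    """
    return 1.0 - 3 * 1e-5 - math.exp(-1.0 / 2**12)


if __name__ == "__main__":
    # The bound only needs to be a positive constant independent of n.
    print(case3_success_lower_bound())
```

The margin is small (the bound is only slightly above .0002), which is why the constants in the three lemmas have to be as tight as $10^{-5}$.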
Appendix: Supplementary Calculations for Case 2

Here we show the inequality \k— ari^'^W p-\-kn < —ek holds when ap > 2^^ 

and n is sufficiently large. 

Case A: (m> n'^) Since k < 8n|"logm] for large n, 

an^'^Xk — an^'^Wp -b kn 

< av}'^{a n'^)~^k — -b kn 

< {a/a')n'^k — 2^^n^'®m^/^|’logm]2“® -b kn 

< {a/a')n'^{4:{ni -b n) [logm]) — 2®n^ [logm] -b 4(m -b n) [log m]n 

< 8n^'®[logm] — 32n^[logm] -b8n^[logm] 

< — 16n^[logm] < —2nk < —ek. 



Case B: (m < n ®, m' > n-'^) Since k < 12 [n ®] [logm'] for large n, 

an^'^Xk — an^'^Wp -b kn 

< an^'^{an'^)~^k — 2^®n^'®VP -b kn 

< {a/a')n'^k — 2^®n^'® [log m'] 2“® -b kn 

< {a/a')n'^{4{m' -b 2 [n'®]) [logm']) — 2®n^'®[logm'] 
-b4(m-b 2 [n'®]) [log to'] n 

< 12n'®[n'®] [logm']) — 2®n^'® [log to'] -b 12n[n'®] [logm'] 

< 12n ®[n ®] [logm']) — 32n^ ®[logTO'] -b 13n^ ®[logTO'] 

< — 18n^ ®[logTO'] < —nk < —ek. 
Abstract. The data-broadcast problem consists in finding an infinite
schedule to broadcast a given set of messages so as to minimize the
average response time to clients requesting messages, and the cost of the
broadcast. This is an efficient means of disseminating data to clients,
designed for environments, such as satellites, cable TV, mobile phones,
where there is a much larger capacity from the information source to the
clients than in the reverse direction.

Previous work concentrated on scheduling indivisible messages. Here,
we study a generalization of the model where the messages can be
preempted. We show that this problem is NP-hard, even in the simple
setting where the broadcast costs are zero, and give some practical
2-approximation algorithms for broadcasting messages. We also show
that preemption can improve the quality of the broadcast by an arbitrary
factor.



1 Introduction 

1.1 Motivation 

Data-broadcast is an efficient means of disseminating data to clients in a wireless
communication environment, where there is a much larger capacity from
the information source to the recipients than in the reverse direction, such as
happens when mobile clients (e.g. car navigation systems) retrieve information
(e.g. traffic information) from a base station (e.g. the emitter) through a wireless
medium. In a broadcasting protocol, items are broadcast according to an infinite
horizon schedule and clients do not explicitly send a request for an item to the
server, but connect to the broadcast channels (shared by all the clients) and
wait until the requested item is broadcast. These systems are therefore known as
pseudo-interactive or push-based: the server “pushes” the items, or messages, to
the clients (even if disconnected) according to a schedule which is oblivious to
the effective requests; as opposed to the “traditional” pull-based model, where
the clients send a request to “pull” the required item from the server when they
need it. The quality of the broadcast schedule is measured by the expected service
time of the addressed requests. Furthermore, as each message has a cost for
broadcasting (e.g. a weather broadcast and a news broadcast may have different
costs for the emitter), the server also tries to minimize the resulting cost of
service. The server then has to minimize the expected service response time of
the requests (quality of service) and the broadcast cost of the resulting schedule
(cost of service). The server designs the broadcast schedule from the profile of
the users: given the messages $M_1, \ldots, M_m$, the profile consists of the popularities
of the different messages, that is to say the probabilities $(p_i)$ that
Message $M_i$ is requested by a random user. [?] proposes some techniques to
gauge user profiles in push-based environments.

With the impressive growth of wireless, satellite and cable networks,
data dissemination protocols have a number of applications in research and commercial
frameworks. One of the earliest applications was the Boston Community
Information System (BCIS, 1982) developed at MIT to deliver news and
information to clients equipped personally with radio receivers in metropolitan
Boston. It was also introduced in the early 1980s in the context of Teletext and
Videotex [?]. It is now used by applications that require dissemination among
a huge number of clients. The Advanced Traffic Information System (ATIS) [?],
which provides traffic and route planning information to cars specially equipped
with computers, may have to serve over 100,000 clients in a large metropolitan
city during the rush hours. The news delivery systems on the Internet, such as
PointCast Inc. (1997) or Airmedia Inc. (1997), require efficient information
dissemination systems. A comparison of the push-based system to the traditional
pull-based approach for those problems can be found in [?].

Note that the data-broadcast problem also models the maintenance scheduling
problem and the multi-item replenishment problem [?].

While previous work made the assumption that message transmission cannot
be preempted, we focus in this paper on the case where the messages do not have
uniform transmission times and can be split.



1.2 Background 

Since the early 1980s, many authors [?] have studied the data-broadcast
problem in the restrictive setting where all messages have the same
length, the broadcast is done on a single channel, and time is discrete (this
restricted problem is also known as the Broadcast disks problem or
Dissemination-based systems). In particular, Ammar and Wong [?] give an algebraic
expression of the expected service time of periodic schedules, provide a lower bound,
and prove the existence of an optimal schedule which is periodic. Our Lemmas 2
and 3 and Proposition 1 are generalizations of these results to our setting.
Bar-Noy, Bhatia, Naor and Schieber [?] prove that the problem with broadcast
costs is NP-hard, and after a sequence of papers giving constant factor
approximations [?], Kenyon, Schabanel and Young [?] design a PTAS for the
problem. The papers [?] study related questions pertaining to
prefetching, to caching and to indexing.

As can be seen from the example of broadcasting weather and news reports,
in many applications, it does not make sense to assume that all messages have
the same transmission time; thus a couple of recent papers have explored the
case of non-uniform transmission times. In [?], Vaidya and Hameed report some
experimental results for heuristics on one or two channels. In [?], Kenyon and
Schabanel show that in the case where the messages do not have the same
transmission time, the data-broadcast problem is NP-hard, even if messages have zero
broadcast cost, and does not always admit a periodic optimal schedule. They
show that the natural extension of the lower bound given in [?] is arbitrarily
far from the optimal when the messages have very different lengths. The main
difficulty is due to the fact that, while a long message is being broadcast, all
requests for shorter and more popular messages have to be put on hold. But
in that case, it seems reasonable to allow an occasional interruption of a long
“boring” message transmission so as to broadcast a short popular message. In
other words, one should allow preemption. This is the main motivation for the
preemptive model introduced and studied in this paper.



1.3 Our Contribution 

This paper introduces and studies the model where the messages to be broadcast
have non-uniform transmission times and where their transmission can be
preempted. One of the most interesting contributions from the practical point
of view is that our algorithms (Section 4) generate preemptive schedules whose
costs can be arbitrarily smaller than the optimal cost of any non-preemptive
schedule on some inputs (see Note 1 in Section 4). Thus there is an infinite gap
between the preemptive and non-preemptive problems.

We adopt the following model. The input consists of $m$ messages $M_1, \ldots, M_m$
and a user profile determined by the probabilities $(p_i)_{1 \le i \le m}$ that a user requests
Message $M_i$ ($p_1 + \cdots + p_m = 1$). Each message $M_i$, $i = 1..m$, is composed of
$\ell_i$ packets with transmission time 1 and each broadcast of a packet costs $c_i \ge 0$.
The packets of the messages are broadcast over $W$ identical and synchronized
broadcast channels split into time slots of length 1 (time slot $t$ is the period
of time $[t - 1, t]$). Given a schedule $S$ of the packets into the slots, over the
$W$ channels, a client requesting Message $M_i$ starts monitoring all the channels
at some (continuous) point, downloads the different packets of $M_i$ one at a
time when they are broadcast on some channel, and is served as soon as it has
downloaded all the $\ell_i$ packets of Message $M_i$. The order in which the client has
received the packets of $M_i$ is irrelevant, as in TCP/IP.

The problem is to design a sequence $S$ to schedule the packets over time, so
as to minimize the sum of the expected service time of Poisson requests and of
the average broadcast cost, i.e. so as to minimize
$\limsup_{T \to \infty} (\mathrm{EST}(S, [0, T]) + \mathrm{BC}(S, [0, T]))$; here, $\mathrm{EST}(S, [0, T])$ denotes the expected service time of a request
which is generated at a random uniform (continuous) instant between 0 and $T$,
requests Message $M_i$ with probability $p_i$, and must wait until the $\ell_i$ packets of $M_i$
have been broadcast and downloaded; and $\mathrm{BC}(S, [0, T])$ is the average broadcast
cost of the packets whose broadcast starts between 0 and $T$. Note that this
definition agrees with the one in the literature (e.g. [?]), in the uniform-length
case where the messages are composed of a single packet.
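To make the service-time definition concrete, here is a minimal sketch that evaluates it directly for a periodic schedule. It is an illustration under simplifying assumptions that are not in the paper: a single channel, 0-indexed slots (slot k covers [k, k+1)), and one period of the schedule given as a list of message indices with `None` for an idle slot. The function names are hypothetical.

```python
import math
from typing import Optional, Sequence


def service_time(schedule: Sequence[Optional[int]], msg: int,
                 length: int, t: float) -> float:
    """Service time of a request for message `msg` arriving at time t >= 0.

    Assumes a Round Robin schedule, so that any `length` consecutive
    broadcasts of `msg` carry `length` distinct packets.
    """
    if msg not in schedule:
        raise ValueError("message is never broadcast")
    period = len(schedule)
    downloaded = 0
    k = math.ceil(t)  # first slot that begins at or after the arrival
    while True:
        if schedule[k % period] == msg:
            downloaded += 1
            if downloaded == length:
                return (k + 1) - t  # served when slot k ends
        k += 1


def expected_service_time(schedule: Sequence[Optional[int]], msg: int,
                          length: int, steps_per_slot: int = 100) -> float:
    """Average service time over arrivals uniform in one period.

    The service time is linear in t inside each slot, so the midpoint
    rule used here is exact up to floating-point rounding.
    """
    n = len(schedule) * steps_per_slot
    total = sum(service_time(schedule, msg, length, (j + 0.5) / steps_per_slot)
                for j in range(n))
    return total / n
```

For a single two-packet message with period 4, the glued schedule `[0, 0, None, None]` has a smaller average service time than the regularly spread schedule `[0, None, 0, None]`.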

The results presented in this paper are obtained thanks to the simple but
crucial observation made in Lemma 1: for all $i$, an optimal schedule broadcasts
the packets of Message $M_i$ in Round Robin order. We can thus restrict our search
to Round Robin schedules. From this observation, we get a tractable algebraic
expression for the cost of such a schedule in Lemma 2, from which we derive
the lower bound in Lemma 3. This lower bound is the key to the two main
results of the paper: 1) the problem is strongly NP-hard, even if no broadcast
costs are assumed, in Theorem 1 (note that the NP-hardness proof given in [?]
for the uniform length case requires non-zero broadcast costs); 2) there exists
a polynomial algorithm which constructs a periodic schedule with cost at most
twice the optimal, in Section 4.

The lower bound also reveals some important structural differences between
our model and the previous models. First, surprisingly, as opposed to all the
previous studies, the lower bound cannot be realized by scheduling the packets
regularly but by gluing them together (see Lemma 3): from the individual point
of view of a request for a given message, the message should not be preempted.
This allows us to derive some results from the non-preemptive case studied in [?].
But, whereas non-preemptive strategies cannot approach this lower bound, we
obtain, all the same, an efficient approximation scheme within a factor of 2 by
broadcasting the packets of each message regularly. Second, although the lower
bound specializes to the one designed in [?] when all messages are composed
of a single packet, deriving the lower bound is no longer a straightforward
relaxation of the constraints on the schedule and requires a finer study of the
“ideal” schedules. Moreover, its objective function is no longer convex and its
resolution (in particular the uniqueness of its solution) needs a careful adaptation,
presented in Section 4.5, of the methods introduced in [?].

Note that our preemptive setting also models the case where users do not request
single messages but batches of messages. We can indeed consider the packets of
a message as messages of a batch. The preemptive case studied here is the case
where the batches are all disjoint. In that sense the paper is an extension of some
results in [?].



1.4 The Cost Function

We are interested in minimizing the cost of the schedule $S$, which is a combination
of two quantities on $S$. The first one, denoted by $\mathrm{EST}(S)$, is the expected
service time of a random request (where the average is taken over the
moments when requests occur, and the type $M_i$ of message requested). If we
denote by $\mathrm{EST}(S, I)$ the expected service time of a random request arriving in
time interval $I$, $\mathrm{EST}(S)$ is: $\mathrm{EST}(S) = \limsup_{T \to \infty} \mathrm{EST}(S, [T_0, T])$ for any $T_0$.
If we denote by $\mathrm{ST}(S, M_i, t)$ the service time of a request for $M_i$ arriving at
time $t$, and by $\mathrm{EST}(S, M_i, I)$ the expected service time of a request for $M_i$ arriving
in time interval $I$, we get: $\mathrm{EST}(S, M_i, I) = \frac{1}{|I|} \int_I \mathrm{ST}(S, M_i, t)\, dt$, and
$\mathrm{EST}(S, I) = \sum_{i=1}^{m} p_i\, \mathrm{EST}(S, M_i, I)$.

The second quantity is the broadcast cost $\mathrm{BC}(S)$ of the messages, defined
as the asymptotic value of the broadcast cost $\mathrm{BC}(S, I)$ over a time interval $I$:
$\mathrm{BC}(S) = \limsup_{T \to \infty} \mathrm{BC}(S, [T_0, T])$, for any $T_0$. By definition, each broadcast
of a packet of $M_i$ costs $c_i$. For a time interval $I$, $\mathrm{BC}(S, I)$ is the sum of the cost
of all the packets whose broadcast begins in $I$, divided by the length of $I$. The
quantity which we want to minimize is then: $\mathrm{COST}(S) = \mathrm{EST}(S) + \mathrm{BC}(S)$. Note
that up to scaling the costs $c_i$, any linear combination of EST and BC can be
considered.

2 Preliminary Results 

2.1 Structural Properties 

The following lemma is a crucial observation that will allow us to deal with the
dependencies in a tractable way. From this observation, we derive an algebraic
expression for the cost of a periodic schedule. In the next section, we show that
this expression yields a lower bound on the cost of any schedule. The lower
bound will be used in Section 4 to design efficient approximation algorithms.

Definition 1. A schedule $S$ is said to be Round Robin if at most one packet of each
message $M_i$ is broadcast in any time slot according to $S$, and if $S$ schedules the
packets of each message in Round Robin order (i.e. according to a cyclic order).

Lemma 1 (Round Robin). For any schedule $S$, consider the Round Robin
schedule $S'$ constructed from $S$ by rescheduling in Round Robin order the packets
of each message $M_i$ within the slots reserved in $S$ for broadcasting a packet of $M_i$.
Then: $\mathrm{COST}(S') \le \mathrm{COST}(S)$.

Moreover, if $S$ is periodic and is not Round Robin, then $S'$ is periodic and:
$\mathrm{COST}(S') < \mathrm{COST}(S)$.

Proof. First, $S$ and $S'$ have the same broadcast cost. Second, consider a request
for $M_i$ arriving at time $t$ in $S$, and the first $\ell_i$ time slots where a packet of $M_i$
is broadcast in $S$ after time $t$. The service time of the request is minimized iff
the $\ell_i$ packets of $M_i$ are broadcast in those slots. Thus the expected service time
in $S'$ is at most as large as in $S$. Moreover, if $S$ is periodic with period $T$ and is
not Round Robin, then $S'$ is periodic with period $\le T \prod_i \ell_i$ and its expected
service time is smaller than that of $S$. □



W.l.o.g. we will now only consider Round Robin schedules. 



Lemma 2 (Cost). Consider a periodic schedule $S$ with period $T$. For each $i$,
$n_i$ is the number of broadcasts of Message $M_i$ in a period, and $t^i_j$ the
time elapsed between the beginnings of the $j$-th and the $(j+1)$-th broadcasts of a
packet of Message $M_i$. Then:

$$\mathrm{EST}(S) = \sum_{i=1}^{m} \sum_{j=1}^{n_i \ell_i} p_i\, \frac{t^i_j}{T} \left( \frac{t^i_j}{2} + t^i_{j+1} + \cdots + t^i_{j+\ell_i-1} + 1 \right)
\quad\text{and}\quad
\mathrm{BC}(S) = \frac{1}{T} \sum_{i=1}^{m} c_i n_i \ell_i,$$

where the indices are considered modulo $n_i \ell_i$.

Proof. Consider $i$ in $\{1, \ldots, m\}$. Message $M_i$ is broadcast $n_i$ times per period,
its contribution to the broadcast cost is then $n_i \ell_i c_i / T$. A request is for Message
$M_i$ with probability $p_i$ and arrives between the $j$-th and the $(j+1)$-th broadcasts
of a packet of $M_i$ with probability $t^i_j / T$. It then starts downloading the first
packet after $t^i_j / 2$ time on expectation and ends downloading the last packet
after $t^i_{j+1} + \cdots + t^i_{j+\ell_i-1} + 1$ other time slots.
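The expression of Lemma 2 can be evaluated mechanically from one period of a schedule. The sketch below assumes a single channel, 0-indexed messages and slots, an idle slot encoded as `None`, and a Round Robin schedule in which every message appears at least once per period; `lemma2_cost` is a hypothetical helper, not code from the paper.

```python
from typing import List, Optional, Sequence, Tuple


def lemma2_cost(schedule: Sequence[Optional[int]], p: Sequence[float],
                c: Sequence[float], lengths: Sequence[int]) -> Tuple[float, float]:
    """Return (EST, BC) of a periodic single-channel Round Robin schedule.

    schedule[k] is the message whose packet starts at time k, or None.
    """
    T = len(schedule)
    est = 0.0
    bc = 0.0
    for i, (p_i, c_i, l_i) in enumerate(zip(p, c, lengths)):
        starts: List[int] = [k for k, m in enumerate(schedule) if m == i]
        n = len(starts)  # equals n_i * l_i for a Round Robin schedule
        if n == 0:
            raise ValueError(f"message {i} is never broadcast")
        # gaps[j]: time between the beginnings of broadcasts j and j+1,
        # taken cyclically over the period (a lone broadcast has gap T).
        gaps = [((starts[(j + 1) % n] - starts[j]) % T) or T for j in range(n)]
        for j in range(n):
            # gaps crossed while waiting for the remaining l_i - 1 packets
            tail = sum(gaps[(j + r) % n] for r in range(1, l_i))
            est += p_i * gaps[j] / T * (gaps[j] / 2 + tail + 1)
        bc += c_i * n / T
    return est, bc
```

For a single two-packet message with period 4, the glued schedule `[0, 0, None, None]` gives an expected service time of 3.75, against 4.0 for the spread schedule `[0, None, 0, None]`.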



Remark 1 (Trapezoids representation). Note that we can represent the cumulated
response time to requests for a given message over a period of time by the sum of
the areas of trapezoids, as shown in Figure 1: the black arrows are two examples of
requests, their waits are highlighted in black, and the extra cost for downloading
the last packet is in grey.

Fig. 1: The expected service time.



2.2 Optimality Results 

Theorem 1 (NP-Hardness). Finding the optimal schedule is strongly NP-hard
on a single channel and with zero cost messages.

Proof sketch. (Omitted) The proof is derived from the NP-hardness proof of
the non-preemptive case given in [?]: we show that deciding whether the lower
bound in Lemma 3 is realized is at least as hard as N-partition. □

Remark 2. Note that [?] yields another NP-hardness proof by stating that the
uniform length case with non-zero cost is already NP-hard; however the present
proof does not use costs.

Proposition 1 (Optimal periodic). There exists an optimal schedule which
is periodic. It can be computed in exponential time.

Proof sketch. (Omitted) The proof is based on the search for a minimum cost cycle
in a finite graph, and the lemmas are broadly inspired by [?], but their
proofs need to be widely adapted in order to take into account the segmentation
of the messages into packets. □



3 A Lower Bound 

Finding a good lower bound is a key point to designing and proving efficient
approximation algorithms for this problem. An algorithm to compute the value
of the following lower bound will be given in Section 4.5.

Lemma 3 (Lower bound). The following minimization problem is a lower
bound to the cost of any schedule of the packets of $M_1, \ldots, M_m$ on $W$ channels:

$$\mathrm{LB}(M) = \min_{\tau > 0} \sum_{i=1}^{m} \left[ p_i \left( \frac{\tau_i \ell_i}{2} + \ell_i - \frac{\ell_i - 1}{2 \tau_i} \right) + \frac{c_i}{\tau_i} \right]$$

subject to: (i) $\forall i,\ \tau_i \ge 1$ and (ii) $\sum_{i=1}^{m} \frac{1}{\tau_i} \le W$.

This minimization problem admits a unique solution $\tau^*$. $\mathrm{LB}(M)$ is realized if
and only if one can broadcast all the packets of each $M_i$ consecutively, periodically,
exactly every $(\tau^*_i \cdot \ell_i)$.



Proof sketch. According to Lemma 1, let $S$ be a periodic Round Robin schedule
of the packets of messages $M_1, \ldots, M_m$ on $W$ channels with period $T$. We
use the same notations $(n_i)$ and $(t^i_j)$ as in Lemma 2. Given that Message $M_i$ is
broadcast $n_i$ times per period, we seek the optimal value of the $(t^i_j)$ for each
message independently. We relax the constraints on the schedule by authorizing
messages to overlap and to be scheduled outside the slots. The proof works in
three steps:

1. If the expected service time for $M_i$ with $\ell_i \ge 2$ is minimized, then for any pair
of consecutive broadcasts of the same packet of $M_i$ at time $t_1$ and $t_2$ ($t_1 < t_2$),
a packet of $M_i$ is broadcast at time $(t_1 + 1)$ or $(t_2 - 1)$.

2. If the expected service time of $M_i$ is minimized, the packets of $M_i$ are broadcast
within blocks of $\ell_i$ consecutive time slots.

3. The blocks are optimally scheduled periodically every $T / n_i$.

Step 1. Consider $M_i$ with $\ell_i \ge 2$ and two consecutive packets of $M_i$ (w.l.o.g.
packets 1 and 2). For $1 \le k \le n_i$, let $I_k$, $J_k$, and $K_k$ be the intervals delimited
by the end of the $k$-th broadcast of packet 1, the beginning and the end of the
$k$-th broadcast of packet 2, and the beginning of the next broadcast of a packet
of $M_i$ (note that $|J_k| = 1$).

[Diagram: timeline showing packet 1, then $I_k$, packet 2 ($J_k$), $K_k$, followed by $I_{k+1}$ and $J_{k+1}$.]

Let $S'$ be the schedule that schedules the packets of $M_i$ as in $S$ except that
packet 2 is always scheduled next to packet 1. A request for $M_i$ that arises outside
intervals $I_k$, $J_k$ and $K_k$ has the same service time in $S$ and in $S'$. A request that
arises in $I_k$ is served one time unit later in $S'$ than in $S$. But a request that
arises in $J_k \cup K_k$ is served $|I_{k+1}|$ earlier in $S'$ than in $S$. The expected service
time varies then from $S$ to $S'$ by:

$$\sum_{k=1}^{n_i} \bigl( |I_k| \times 1 - (1 + |K_k|) \times |I_{k+1}| \bigr) = - \sum_{k=1}^{n_i} |K_k| \cdot |I_{k+1}| \le 0.$$

Thus, the expected service time in $S'$ is at most as big as in $S$ and smaller if
there exists in $S$ a pair of consecutive broadcasts of packet 2 occurring at times $t_1$
and $t_2$ ($t_1 < t_2$) so that no packet of $M_i$ is broadcast at time $(t_1 + 1)$ ($|K_k| \neq 0$)
and $(t_2 - 1)$ ($|I_{k+1}| \neq 0$).






Nicolas Schabanel 



Step 2 is obtained by contradiction using the transformation in Step 1.

Step 3. We are thus left with $n_i$ blocks of $\ell_i$ packets of $M_i$. Let $t_k$ be the time
elapsed between the beginning of the $k$-th and the $(k+1)$-th block. Lemma 2 yields
that the expected service time for $M_i$ is:

$$\ell_i + \sum_{k=1}^{n_i} \frac{t_k^2 - \ell_i(\ell_i - 1)}{2T},$$

which is minimized under the constraint $\sum_{k=1}^{n_i} t_k = T$, when for all $k$, $t_k = T / n_i$.
Define $\tau_i = T / (n_i \ell_i)$. The cost of $S$ is thus bounded from below by:

$$\mathrm{COST}(S) \ge \sum_{i=1}^{m} \left\{ p_i \left( \frac{\tau_i \ell_i}{2} + \ell_i - \frac{\ell_i - 1}{2 \tau_i} \right) + \frac{c_i}{\tau_i} \right\}.$$

Finally $n_i \ell_i \le T$ and $\sum_i n_i \ell_i \le W T$ imply: (i) $\tau_i \ge 1$ and (ii) $\sum_i 1/\tau_i \le W$.
Minimizing over those constraints yields the lower bound on the cost of any
schedule. The uniqueness of the solution $\tau^*$ to the minimization problem will be
proved in Section 4.5.

Moreover by construction, the lower bound is realized iff there exists a periodic
Round Robin schedule that broadcasts the $\ell_i$ packets of each $M_i$ in consecutive
slots, periodically every $\tau^*_i \ell_i$. □
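As a small illustration of the lower bound, the following sketch evaluates the objective of Lemma 3 at a given vector $\tau$ and checks the two constraints. It does not solve the minimization (that is the subject of Section 4.5); the function names and the 0-indexed argument lists are assumptions made for the example.

```python
from typing import Sequence


def lb_objective(tau: Sequence[float], p: Sequence[float],
                 c: Sequence[float], lengths: Sequence[int]) -> float:
    """Value of the Lemma 3 objective at tau (not its minimum)."""
    return sum(
        p_i * (t_i * l_i / 2 + l_i - (l_i - 1) / (2 * t_i)) + c_i / t_i
        for t_i, p_i, c_i, l_i in zip(tau, p, c, lengths)
    )


def is_feasible(tau: Sequence[float], channels: int) -> bool:
    """Constraints (i) tau_i >= 1 and (ii) sum of 1/tau_i <= W."""
    return all(t >= 1 for t in tau) and sum(1 / t for t in tau) <= channels
```

With one two-packet message of zero cost and $\tau_1 = 2$, the objective equals 3.75, which is the expected service time of the glued periodic schedule of period $\tau_1 \ell_1 = 4$.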

Remark 3. One can derive a trivial lower bound close to ours, up to an additive term
$\sum_i p_i \ell_i$, by simply optimizing the time needed to download a given
packet for each message. Although this latter lower bound is sufficient to analyze our
heuristics, it is never realized and cannot be used to yield our NP-hardness
result.



4 Constant Factor Approximation Algorithms 



Note 1. The optimal fictitious schedule suggested by the lower bound $\mathrm{LB}(M)$ is
not realizable in general. Actually, as shown in [?], if no preemption is used,
the optimal cost of a schedule can be arbitrarily far from the lower bound $\mathrm{LB}(M)$.
Consider the problem of scheduling W+1 messages Mi, . . . , Mw+i on W chan- 
nels, where Mi counts £i = packets, cost Ci = 0, and request probabil- 

ity Pi = aj L'‘~^ , where a is such that pi-l-- • Mpm = 1- In that case, one can show 
by induction on W that when L goes to infinity, the optimal schedule without 
preemption has a cost OPTwhithout preemption = ), but LB(M) = 0(1). 



In order to minimize the cost of the schedule, we won't follow exactly the
fictitious schedule suggested by the lower bound in Lemma 3. In fact, remark
that if we spread regularly the packets of each message $M_i$, every $\tau_i$, in this
fictitious schedule, the expected service time to a random request increases by
less than a factor of 2. This will be helpful in order to design an efficient
approximation algorithm for the preemptive case.

Fig. 2: Spreading the packets regularly.
Algorithm 1 Randomized algorithm
Input: Some positive numbers $\tau_1, \ldots, \tau_m$,
verifying $\sum_{i=1}^{m} 1/\tau_i \le 1$.
Let $\tau_0 > 0$ so that:

$1/\tau_0 = 1 - \sum_{i=1}^{m} 1/\tau_i$

Output:

for t = 1..∞ do

Draw $i \in \{0, 1, \ldots, m\}$ with probability
$1/\tau_i$. Schedule during slot $t$
the next packet of Message $M_i$ in the
Round Robin order, if $i \ge 1$; and Idle
during slot $t$, otherwise.
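A minimal sketch of the randomized algorithm as a generator follows. The conventions are assumptions made for the example: messages are 0-indexed, an idle slot (the paper's draw $i = 0$) is reported as `None`, and the caller supplies the random source; `randomized_schedule` is a hypothetical name.

```python
import random
from itertools import islice
from typing import Iterator, List, Optional, Sequence, Tuple


def randomized_schedule(tau: Sequence[float], lengths: Sequence[int],
                        rng: random.Random) -> Iterator[Optional[Tuple[int, int]]]:
    """Yield, slot after slot, (message, packet) or None for an idle slot."""
    weights: List[float] = [1.0 / t for t in tau]
    idle = 1.0 - sum(weights)  # plays the role of 1/tau_0
    if idle < -1e-12:
        raise ValueError("the sum of 1/tau_i must not exceed 1")
    choices: List[Optional[int]] = list(range(len(tau))) + [None]
    probs = weights + [max(idle, 0.0)]
    next_packet = [0] * len(tau)  # Round Robin pointer of each message
    while True:
        i = rng.choices(choices, weights=probs)[0]
        if i is None:
            yield None
        else:
            yield (i, next_packet[i])
            next_packet[i] = (next_packet[i] + 1) % lengths[i]


if __name__ == "__main__":
    rng = random.Random(0)
    for slot in islice(randomized_schedule([2.0, 4.0], [2, 3], rng), 10):
        print(slot)
```

Keeping one Round Robin pointer per message is enough to guarantee that the packets of each message come out in cyclic order, whatever the random draws are.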



Algorithm 2 Greedy algorithm
Input: Some positive numbers $\tau_1, \ldots, \tau_m$,
verifying $\sum_{i=1}^{m} 1/\tau_i \le 1$.
Let $c_0 = p_0 = 0$ and $\tau_0 > 0$ so that:

$1/\tau_0 = 1 - \sum_{i=1}^{m} 1/\tau_i$

Output:

for t = 1..∞ do

Select $i \in \{0, 1, \ldots, m\}$ which minimizes
$\bigl(c_i - p_i \tau_i \sum_{j=1}^{\ell_i} s^t_{i,j}\bigr)$.

Schedule during slot $t$ the next packet
of $M_i$ in the Round Robin order, if $i \ge 1$;
and Idle during slot $t$, otherwise.



We will first present algorithms that construct efficient schedules on a single
channel in Sections 4.1, 4.2 and 4.3; then Section 4.4 shows how to extend these
algorithms to the multichannel case, using a result of [?].



4.1 A Randomized Algorithm 



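Theorem 2. Given $m$ messages $M_1, \ldots, M_m$, the expected cost of the one-channel
schedule $S$ generated by the randomized algorithm 1 is:

$$\mathbf{E}[\mathrm{COST}(S)] = \frac{1}{2} + \sum_{i=1}^{m} \left( p_i \tau_i \ell_i + \frac{c_i}{\tau_i} \right).$$

Thus if $\tau = \tau^*$ realizes $\mathrm{LB}(M)$: $\mathbf{E}[\mathrm{COST}(S)] \le 2 \cdot \mathrm{LB}(M) - 3/2$.

Proof. A packet of $M_i$ is broadcast with probability $1/\tau_i$ in each slot of $S$. The expected
frequency of $M_i$ is then $1/\tau_i$ and $\mathbf{E}[\mathrm{BC}(S)] = \sum_{i=1}^{m} c_i/\tau_i$. A request for $M_i$ is
served after $\ell_i$ downloads of a packet of $M_i$: it waits on expectation $1/2$ until
the end of the current time-slot and $\tau_i \ell_i$ up to the end of the download of the
last packet of $M_i$. Then, $\mathbf{E}[\mathrm{EST}(S)] = 1/2 + \sum_{i=1}^{m} p_i \tau_i \ell_i$.

Finally $\tau^*_i \ge 1$ and $\ell_i \ge 1$ imply: $2\,\mathrm{LB}(M) \ge \sum_{i=1}^{m} (p_i \tau^*_i \ell_i + c_i/\tau^*_i) + 2$, which
yields the last statement.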
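The last step of the proof can be spelled out as follows. This is a worked version of the inequality, using the objective of Lemma 3 evaluated at $\tau^*$ and the facts $\tau^*_i \ge 1$, $\ell_i \ge 1$, $c_i \ge 0$ and $\sum_i p_i = 1$; it is a sketch of the omitted arithmetic rather than the author's own derivation.

```latex
\begin{align*}
2\,\mathrm{LB}(M)
  &= \sum_{i=1}^{m} \Bigl[ p_i \Bigl( \tau^*_i \ell_i + 2\ell_i - \frac{\ell_i - 1}{\tau^*_i} \Bigr)
       + \frac{2 c_i}{\tau^*_i} \Bigr] \\
  &\ge \sum_{i=1}^{m} \Bigl[ p_i \tau^*_i \ell_i + p_i (\ell_i + 1) + \frac{c_i}{\tau^*_i} \Bigr]
       && \text{since } \tau^*_i \ge 1 \text{ and } c_i \ge 0 \\
  &\ge \sum_{i=1}^{m} \Bigl( p_i \tau^*_i \ell_i + \frac{c_i}{\tau^*_i} \Bigr) + 2
       && \text{since } \ell_i \ge 1 \text{ and } \textstyle\sum_i p_i = 1 \\
  &= \mathbf{E}[\mathrm{COST}(S)] - \tfrac{1}{2} + 2 .
\end{align*}
```

Rearranging gives $\mathbf{E}[\mathrm{COST}(S)] \le 2 \cdot \mathrm{LB}(M) - 3/2$.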



4.2 A Greedy Approximation 

We present in this section a derandomized version of the randomized algorithm
above.

As shown in Figure 3, we define the state of the schedule at time slot $t$ as a
vector $s^t$, such that: for any $i$ and $1 \le j \le \ell_i$, the $j$-th of the $\ell_i$ last
broadcasts of a packet of $M_i$ before time $t$ starts at time
$(t - (s^t_{i,1} + \cdots + s^t_{i,j}))$. Since no request arrives before $t = 0$, we equivalently
assume that all the packets of all messages are fictively broadcast at time $t = 0$,
and initially, at time $t = 0$: for all $i$ and $j$, $s^0_{i,j} = 0$.

Fig. 3: The state $(s_{i,j})$ at time $t$.
Theorem 3. Given $m$ messages $M_1, \ldots, M_m$, the cost of the one-channel schedule
$S$ generated by the greedy algorithm 2 is:

$$\mathrm{COST}(S) \le \frac{1}{2} + \sum_{i=1}^{m} \left( p_i \tau_i \ell_i + \frac{c_i}{\tau_i} \right).$$

Thus if $\tau = \tau^*$ realizes $\mathrm{LB}(M)$: $\mathrm{COST}(S) \le 2 \cdot \mathrm{LB}(M) - 3/2$.

Proof sketch. (Omitted) The greedy algorithm is a derandomized version of the
algorithm above. The greedy choice ensures that at any time $t$, the choice made in
time slot $t$ minimizes the expected cost of the already allocated slots $1, \ldots, t - 1$, if
the schedule were to continue with the randomized scheme. Its cost is then, at any
time, bounded from above by the expected cost of the randomized schedule. □



4.3 A Deterministic Periodic Approximation 

It is sometimes required to have a fixed schedule instead of generating it on the
fly. For instance, it helps to design caching strategies [?]. The next result shows
that one can construct an efficient periodic schedule with polynomial period.
Note that this also allows one to guarantee a bound (the period) on the service time
of any request.

Theorem 4. One can construct in polynomial time a periodic schedule with
cost $\le 2 \cdot \mathrm{LB}(M)$ and period polynomial in the total length and cost of the



messages ii)'^ + 2 X)™ i 



Proof sketch. (Omitted) The schedule is constructed as shown in Figure 4:
1) First, schedule all the packets of each message during the first
$\mathcal{L} =_{\mathrm{def}} \sum_i \ell_i$ time slots;
2) Second, execute $T$ steps of the greedy algorithm above;
3) Third, sort the set $X = \{k \tau^*_i : 1 \le k \le \ell_i\}$ in increasing order and
schedule during the next $\mathcal{L}$ time slots the $k$-th packets of the messages $M_i$
in order of increasing $k \tau^*_i$;
4) Finally, complete with some packets of the messages in order to ensure that
for all $i$, the number of broadcasts of a packet of $M_i$ in a period is a multiple
of $\ell_i$, and thus guarantee the Round Robin property.
One can show that the cost of the resulting schedule is at most $2\,\mathrm{LB}(M)$ as
soon as the period is large enough.

Fig. 4: A periodic approximation. (Phases, in order: all the packets; T steps of Greedy Algorithm; all the packets in order; junction for Round Robin.)



4.4 Multi-channel 2-Approximations

The performance ratio proof for the randomized algorithm given above relies only
on the fact that we know how to broadcast the packets of each $M_i$ every $\tau_i$ in
expectation. In order to extend the result to the multi-channel case, we only
need to broadcast the packets of each $M_i$ with probability $1/\tau_i$, while
ensuring that two packets of the same message are not broadcast during the
same time slot. A straightforward application of the method designed in [?]
to extend the single-channel randomized algorithm to the multi-channel case then
yields the result.

The multi-channel greedy algorithm is again obtained by derandomizing the
schedule, and by extending the greedy choice as in [?]. Finally, the extension of
the periodic approximation is constructed exactly as in Section 4.3, except
that one uses the multi-channel greedy algorithm instead of the single-channel
one.

4.5 Solving the Lower Bound 

The aim of this last section is to solve the following generic non-linear pro- 
gram (A), defined by: 



where $W$ is a positive integer, $a_1, \ldots, a_m$ are positive numbers, and
$b_1, \ldots, b_m$ are arbitrary numbers.

We present essentially an extension of the method designed in [?] for the
special case where for all $i$, $b_i \ge 0$. The results presented are basically the
same, but the proofs need to be adapted. As in [?], we introduce a relaxed
minimization problem (A'), which does not require the constraint (i), and which
can be solved algebraically. The solution to the relaxed problem will allow us to
construct the solution to (A) and prove its uniqueness.

Lemma 4 (Relaxation). Given some positive numbers $a_1, \ldots, a_m$, a positive
integer $W$ and some numbers $b_1, \ldots, b_m$, the following minimization problem:



admits a unique solution $\tau'$ verifying: $\tau'_i = \sqrt{(b_i + \lambda')/a_i}$, for a
certain $\lambda' \ge 0$. If for all $i$, $b_i \ge 0$ and $\sum_i \sqrt{a_i/b_i} \le W$, then
$\lambda' = 0$; else $\lambda'$ is the unique solution to: $\sum_i \sqrt{a_i/(b_i + \lambda')} = W$.

Proof sketch. (Omitted) Solved by careful use of Lagrangian relaxation. □

Lemma 5. Consider the two non-linear minimization problems (A) and (A'),
a solution $\tau^*$ to (A) and the solution $\tau'$ to (A'). Then, for all $i$, if
$\tau'_i < 1$, then $\tau^*_i = 1$.

Proof. The proof given in [?] is only based on the unimodality (and not on
the convexity) of the terms $a_i\tau_i + b_i/\tau_i$. Their proof then naturally extends
to the case where some $b_i$ may be negative. □

Corollary 1 (Uniqueness). The minimization problem (A) admits a unique solution
$\tau^*$, which can be computed in polynomial time.

Proof. Consider a solution $\tau^*$ to (A). We compute the solution $\tau'$ to (A'). If
for some $i_0$, $\tau'_{i_0} < 1$, then $\tau^*_{i_0} = 1$. Thus, we remove this variable
from Problem (A) by fixing its value to 1, and iterate. If for all $i$,
$\tau'_i \ge 1$, then $\tau'$ is also a solution of (A), which is thus unique:
$\tau^* = \tau'$. □
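
The iteration in the proof of Corollary 1 can be sketched in Python. This is only an illustration under assumptions that are not stated in the text: program (A) is taken to be "minimize $\sum_i (a_i\tau_i + b_i/\tau_i)$ subject to (i) $\tau_i \ge 1$ and $\sum_i 1/\tau_i \le W$", fixing a variable to 1 is assumed to consume one unit of the budget $W$, and $\lambda'$ is located by bisection. The function names `solve_relaxed` and `solve_A` are hypothetical.

```python
import math

def solve_relaxed(a, b, W):
    """Relaxed problem (A'), assumed form: minimize sum(a_i*t_i + b_i/t_i)
    subject to sum(1/t_i) <= W, without the constraint t_i >= 1 (Lemma 4)."""
    pairs = list(zip(a, b))
    if all(bi > 0 for bi in b) and sum(math.sqrt(ai / bi) for ai, bi in pairs) <= W:
        lam = 0.0  # the budget constraint is not tight
    else:
        # Find lam with sum(sqrt(a_i / (b_i + lam))) = W by bisection.
        # The left-hand side decreases in lam and requires b_i + lam > 0.
        g = lambda x: sum(math.sqrt(ai / (bi + x)) for ai, bi in pairs) - W
        lo = max(0.0, max(-bi for bi in b))
        step = 1.0
        while g(lo + step) > 0:
            step *= 2.0
        hi = lo + step
        for _ in range(200):
            mid = (lo + hi) / 2.0
            if mid <= lo or mid >= hi:
                break  # interval exhausted at floating-point precision
            if g(mid) > 0:
                lo = mid
            else:
                hi = mid
        lam = hi
    return [math.sqrt((bi + lam) / ai) for ai, bi in pairs]

def solve_A(a, b, W):
    """Iterate as in the proof of Corollary 1: variables whose relaxed value is
    below 1 are fixed to 1 (Lemma 5) and removed, then the rest is re-solved."""
    tau = [None] * len(a)
    free = list(range(len(a)))
    budget = W
    while free:
        if budget <= 0:
            raise ValueError("infeasible under the assumed constraints")
        t = solve_relaxed([a[i] for i in free], [b[i] for i in free], budget)
        small = [i for i, ti in zip(free, t) if ti < 1.0]
        if not small:
            for i, ti in zip(free, t):
                tau[i] = ti
            break
        for i in small:
            tau[i] = 1.0
        free = [i for i in free if i not in small]
        budget -= len(small)
    return tau
```

For instance, with $a = (4, 1)$, $b = (1, 100)$ and $W = 2$, the first relaxed solution has $\tau'_1 < 1$, so $\tau_1$ is fixed to 1 and the remaining one-variable problem gives $\tau_2 = 10$.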

Acknowledgment. We’d like to thank Neal E. Young and Claire Kenyon, for 
useful comments and careful reading of the paper. 

The full version of the paper is available at http://www.ens-lyon.fr/~nschaban.


Abstract. Several recent papers have shown how to approximate the
difference $\sum_i |a_i - b_i|$ or $\sum_i |a_i - b_i|^2$ between two functions, when the
function values $a_i$ and $b_i$ are given in a data stream, and their order is
chosen by an adversary. These algorithms use little space (much less than
would be needed to store the entire stream) and little time to process
each item in the stream and give approximations with small relative
error. Using different techniques, we show how to approximate the
$L^p$-difference $\sum_i |a_i - b_i|^p$ for any rational-valued $p \in (0, 2]$, with
comparable efficiency and error. We also show how to approximate
$\sum_i |a_i - b_i|^p$ for larger values of $p$ but with a worse error guarantee. These
results can be used to assess the difference between two chronologically or
physically separated massive data sets, making one quick pass over each data
set, without buffering the data or requiring the data source to pause.



1 Introduction 

[Some of the following material is excerpted from [?], with the authors’
permission. Readers familiar with [?] may skip to Section 1.1.]

Massive data sets are increasingly important in a wide range of applica- 
tions, including observational sciences, product marketing, and monitoring and 
operations of large systems. In network operations, raw data typically arrive in 
streams, and decisions must be made by algorithms that make one pass over each 
stream, throw much of the raw data away, and produce “synopses” or “sketches” 
for further processing. Moreover, network-generated massive data sets are often 
distributed: Several different, physically separated network elements may receive 
or generate data streams that, together, comprise one logical data set. To be 
of use in operations, the streams must be analyzed locally and their synopses
sent to a central operations facility. The enormous scale, distributed nature, and
one-pass processing requirement on the data sets of interest must be addressed
with new algorithmic techniques.

* Part of this work was done while the first author was visiting AT&T Labs.

** An expanded version of this paper is available in preprint form at
http://www.research.att.com/~mstrauss/pubs/lp.ps

H. Reichel and S. Tison (Eds.): STACS 2000, LNCS 1770, 2000.

(c) Springer-Verlag Berlin Heidelberg 2000

In [?], the authors presented a new technique: a space-efficient, one-pass
algorithm for approximating the $L^1$ difference $\sum_i |a_i - b_i|$ or $L^2$
difference¹ $(\sum_i |a_i - b_i|^2)^{1/2}$ between two functions, when the function
values $a_i$ and $b_i$ are given as data streams, and their order is chosen by an
adversary. Here we continue that work by showing how to compute
$\sum_i |a_i - b_i|^p$ for any rational-valued $p \in (0,2]$. These algorithms fit
naturally into a toolkit for Internet-traffic monitoring. For example, Cisco
routers can now be instrumented with the NetFlow feature [?]. As packets travel
through the router, the NetFlow software produces summary statistics on each
flow.² Three of the fields in the flow records are
source IP-address, destination IP-address, and total number of bytes of data
in the flow. At the end of a day (or a week, or an hour, depending on what
the appropriate monitoring interval is and how much local storage is available),
the router (or, more accurately, a computer that has been “hooked up” to the
router for monitoring purposes) can assemble a set of values $(x, f_t(x))$, where
$x$ is a source-destination pair, and $f_t(x)$ is the total number of bytes sent from
the source to the destination during a time interval $t$. The difference between
two such functions assembled during different intervals or at different routers is
a good indication of the extent to which traffic patterns differ.

Our algorithm allows the routers and a central control and storage facility to
compute $L^p$ differences efficiently under a variety of constraints. First, a router
may want the difference between $f_t$ and $f_{t+1}$. The router can store a small
“sketch” of $f_t$, throw out all other information about $f_t$, and still be able to
approximate $\|f_t - f_{t+1}\|_p$ from the sketch of $f_t$ and (a sketch of) $f_{t+1}$.

The functions $f_t^{(i)}$ assembled at each of several remote routers $R_i$ at time
$t$ may be sent to a central tape-storage facility $C$. As the data are written to
tape, $C$ may want to compute the difference between $f_t^{(1)}$ and $f_t^{(2)}$, but this
computation presents several challenges. First, each router $R_i$ should transmit
its statistical data when $R_i$'s load is low and the $R_i$-$C$ paths have extra
capacity; therefore, the data may arrive at $C$ from the $R_i$'s in an arbitrarily
interleaved manner. Also, typically the $x$'s for which $f(x) \ne 0$ constitute a
small fraction of all $x$'s; thus, $R_i$ should only transmit $(x, f_t^{(i)}(x))$ when
$f_t^{(i)}(x) \ne 0$. The set of transmitted $x$'s is not predictable by $C$. Finally,
because of the huge size of these streams,³ the central facility will not want to
buffer them in the course of writing them to tape (and cannot read from one part
of the tape while writing to another), and telling $R_i$ to pause is not always
possible. Nevertheless, our algorithm supports approximating the difference
between $f_t^{(1)}$ and $f_t^{(2)}$ at $C$, because it requires little workspace, requires
little time to process each incoming item, and can process in one pass all the
values of both functions $\{(x, f_t^{(1)}(x))\} \cup \{(x, f_t^{(2)}(x))\}$ in any permutation.

¹ Approximating the $L^p$ difference, $\|(a_i) - (b_i)\|_p = (\sum_i |a_i - b_i|^p)^{1/p}$, is
computationally equivalent to approximating the easier-to-read expression
$\sum_i |a_i - b_i|^p$. We will use these interchangeably when discussing computational
issues.

² Roughly speaking, a “flow” is a semantically coherent sequence of packets sent by
the source and reassembled and interpreted at the destination. Any precise definition
of “flow” would have to depend on the application(s) that the source and destination
processes were using to produce and interpret the packets. From the router’s point of
view, a flow is just a set of packets with the same source and destination IP-addresses
whose arrival times at the routers are close enough, for a tunable definition of “close.”

Our $L^p$-difference algorithm achieves the following performance for rational
$p \in (0,2]$:

Consider two data streams of length at most $n$, each representing
the non-zero points on the graph of an integer-valued function on a
domain of size $n$. Assume that the maximum value of either function on
this domain is $M$. Then a one-pass streaming algorithm can compute
with probability $1 - \delta$ an approximation $A$ to the $L^p$-difference $B$ of the
two functions, such that $|A - B| \le \varepsilon B$, using total space and per-item
processing time $(\log(M) \log(n) \log(1/\delta)/\varepsilon)^{O(1)}$. The input streams may
be interleaved in an arbitrary (adversarial) order.

1.1 $L^p$-Differences for p Other than 1 or 2

Our results fill in gaps left by recent work. While the $L^1$- and $L^2$-differences
are important, the $L^p$-differences for other $p$, say the $L^{1.5}$-difference, provide
additional information. In particular, there are $(a_i)$, $(b_i)$, $(a'_i)$, and $(b'_i)$ such
that $\sum_i |a_i - b_i| = \sum_i |a'_i - b'_i|$ and $\sum_i |a_i - b_i|^2 = \sum_i |a'_i - b'_i|^2$, but
$\sum_i |a_i - b_i|^{1.5}$ and $\sum_i |a'_i - b'_i|^{1.5}$ are different.
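
A concrete instance of such sequences can be checked numerically. The vectors below are our own illustrative choice (they do not come from the paper): the differences (1, 1, 4) and (0, 3, 3) agree for p = 1 and p = 2 but not for p = 1.5.

```python
def lp_sum(a, b, p):
    """Return sum_i |a_i - b_i|**p for two equal-length sequences."""
    return sum(abs(x - y) ** p for x, y in zip(a, b))

# Illustrative vectors (an assumption, not taken from the paper).
a, b = (1, 1, 4), (0, 0, 0)
a2, b2 = (0, 3, 3), (0, 0, 0)

for p in (1, 2, 1.5):
    print(p, lp_sum(a, b, p), lp_sum(a2, b2, p))
```

Both pairs give 6 for p = 1 and 18 for p = 2, while for p = 1.5 the first pair gives 10 and the second gives $2 \cdot 3^{1.5} \approx 10.39$.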

By showing how to compute the $L^p$ difference for varying $p$, we provide an
approximate difference algorithm that is precisely tunable for the application at
hand.

We also give an algorithm for $p > 2$, though with an error guarantee somewhat
worse than the guarantee available for the $p \le 2$ cases. Still, that result is
a randomized algorithm with the correct mean, which is an advantage in some
situations.



1.2 Organization 

The rest of this paper is organized as follows. In Section 2, we describe precisely
our model of computation and its complexity measure. We present our main
technical results in Section 3. We discuss the relationship of our algorithm to
other recent work and present some open problems in Section 4. In this extended
abstract, all proofs are omitted.

³ In 1999, a WorldNet gateway router generated more than 10Gb of NetFlow data each
day.

2 Background

We describe the details of our algorithm in terms of the streaming model used
in [?]. This model is closely related to that of [?]. It is immediate to adapt our
algorithm to the sketch model used in [?]; we give only brief comments.

2.1 The Streaming Model 

A data stream is a sequence of data items $\sigma_1, \sigma_2, \ldots, \sigma_n$ such that, on each
pass through the stream, the items are read once in increasing order of their
indices. We assume the items $\sigma_i$ come from a set of size $M$, so that each $\sigma_i$ has
size $\log M$. In the computational model, we assume that the input is one or more
data streams. We focus on two resources: the workspace required in words and
the time to process an item in the stream, but disregard pre- and post-processing
time.

Definition 1. The complexity class $PASST(s(\delta,\varepsilon,n,M), t(\delta,\varepsilon,n,M))$ (read as
“probably approximately correct streaming space complexity $s(\delta,\varepsilon,n,M)$ and time
complexity $t(\delta,\varepsilon,n,M)$”) contains those functions $f$ for which one can output a
random variable $X$ such that $|X - f| \le \varepsilon f$ with probability at least $1 - \delta$ and
computation of $X$ can be done by making a single pass over the data, using
workspace at most $s(\delta,\varepsilon,n,M)$ and taking time at most $t(\delta,\varepsilon,n,M)$ to process
each of the $n$ items, each of which is in the range 0 to $M - 1$.

If $s = t$, we also write $PASST(s)$ for $PASST(s,t)$.

2.2 The Sketch Model 

Sketches were used in [?] to check whether two documents are nearly duplicates.
A sketch can also be regarded as a synopsis data structure [?].

Definition 2. The complexity class $PAS(s(\delta,\varepsilon,n,M))$ (to be read as “probably
approximately correct sketch complexity $s(\delta,\varepsilon,n,M)$”) contains those functions
$f : X \times X \to Z$ of two inputs for which there exists a set $S$ of size $2^s$, a
randomized sketch function $h : X \to S$, and a randomized reconstruction function
$\rho : S \times S \to Z$ such that, for all $x_1, x_2 \in X$, with probability at least $1 - \delta$,
$|\rho(h(x_1), h(x_2)) - f(x_1, x_2)| \le \varepsilon f(x_1, x_2)$.

By “randomized function” of $k$ inputs, we mean a function of $k + 1$ variables.
The first input is distinguished as the source of randomness. It is not necessary
that, for all settings of the last $k$ inputs, for most settings of the first input, the
function outputs the same value.

2.3 Medians and Means of Unbiased Estimators 

We now recall a general technique of randomized approximation schemes. 

Lemma 1. Let $X$ be a real-valued random variable such that, for some $c$, we
have $\mathrm{var}[X] \le c \cdot E^2[X]$. Then, for any $\varepsilon, \delta > 0$, there exists a random
variable $Z$ such that $\Pr(|Z - E[X]| > \varepsilon E[X]) < \delta$. Furthermore, $Z$ is a function of
$O(\log(1/\delta)/\varepsilon^2)$ independent samples of $X$.

3 The Algorithm

In this section we prove our main theorem:

Theorem 1. For rational $p \in (0,2]$, the $L^p$-difference of two functions $(a_i)$ and
$(b_i)$ is in

$PASST((\log(n)\log(M)\log(1/\delta)/\varepsilon)^{O(1)})$, (1)

when the stream items consist of values $a_i$ or $b_i$, presented in arbitrary order.
This $L^p$-difference is also in

$PAS((\log(n)\log(M)\log(1/\delta)/\varepsilon)^{O(1)})$. (2)



3.1 Intuition 



We first give an intuitive overview of the algorithm. Our goal is to approximate
$L^p = \sum_i |a_i - b_i|^p$, where the values $a_i, b_i < M$ are presented in a stream in
any order, and the index $i$ runs up to $n$. We are given tolerance $\varepsilon$ and maximum
error probability $\delta$. We wish to output a random variable $Z$ such that
$\Pr(|Z - L^p| > \varepsilon L^p) < \delta$, using total space and per-item processing time
polynomial in $(\log(n) \log(M) \log(1/\delta)/\varepsilon)$. The input is a stream consisting of
tuples of the form $(i, c, \theta)$, where $i \in [0,n)$, $c \in [0,M)$, and $\theta \in \{\pm 1\}$. The
tuple $(i, c, \theta)$ denotes that $a_i = c$ if $\theta = +1$ and $b_i = c$ if $\theta = -1$.

Below, the reader may consider $f$ to be a deterministic function with
$(f(b) - f(a))^2 = |b - a|^p$. (In the next few sections, we will construct a randomized
function, $f(r,x)$, such that $E[(f(r,b) - f(r,a))^2] \approx |b - a|^p$.) The algorithm
proceeds as in Figure 1.

To see how the algorithm works, first focus on single values for $k$ and $\ell$. Note
that $Z = Z_{k,\ell} = \sum_i \sigma_i (f(a_i) - f(b_i))$. Separating the diagonal and off-diagonal
terms of $Z^2$,

$E[Z^2] = E\big[\big(\sum_i \sigma_i (f(a_i) - f(b_i))\big)^2\big]$ (3)

$= E\big[\sum_i \sigma_i^2 (f(a_i) - f(b_i))^2 + \sum_{i \ne i'} \sigma_i \sigma_{i'} (f(a_i) - f(b_i))(f(a_{i'}) - f(b_{i'}))\big]$ (4)

$= E\big[\sum_i |a_i - b_i|^p + \sum_{i \ne i'} \sigma_i \sigma_{i'} (f(a_i) - f(b_i))(f(a_{i'}) - f(b_{i'}))\big]$ (5)

$= \sum_i |a_i - b_i|^p.$ (6)

In the last line, we used the fact that $E[\sigma_i] = 0$ and that $\sigma_i$ and $\sigma_{i'}$ are
independent for $i \ne i'$. A similar calculation shows that $\mathrm{var}(Z^2) \le O(E^2[Z^2])$. We
can therefore apply Lemma 1 and take a median of means of independent copies of
$Z^2$ to get the desired result.
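
The median-of-means step of Lemma 1 can be sketched as follows. This is a minimal illustration, not the paper's analysis: the group counts stand in for the $O(\log(1/\delta))$ and $O(1/\varepsilon^2)$ bounds, whose constants are not given here, and the function name `median_of_means` is ours.

```python
import random
from statistics import mean, median

def median_of_means(sample, groups, per_group):
    """Draw groups * per_group independent samples by calling sample() and
    return the median of the per-group averages."""
    averages = [mean(sample() for _ in range(per_group)) for _ in range(groups)]
    return median(averages)

# Usage: estimate E[X] = 1 for X uniform on [0, 2] (an illustrative choice).
rng = random.Random(0)
estimate = median_of_means(lambda: rng.uniform(0.0, 2.0), groups=9, per_group=400)
```

Averaging within a group reduces the variance; taking the median across groups then makes a large deviation unlikely unless more than half of the group averages deviate.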

It is straightforward to check that cost bounds are met. The analysis is omitted
in this extended abstract.

Algorithm L1(⟨(i, c, θ)⟩)

  Initialize:
    For k = 1 to O(log(1/δ)) do
      For ℓ = 1 to O(1/ε²) do
        Z_{k,ℓ} = 0;
        pick sample points for random variable families {σ_i} and {R_i};
        // σ_i = ±1 and R_i is described below

  Stream processing:
    For each tuple (i, c, θ) in the input stream do
      For k = 1 to O(log(1/δ)) do
        For ℓ = 1 to O(1/ε²) do
          Z_{k,ℓ} += σ_i θ f(R_i, c);

  Report:
    Output median_k avg_ℓ Z²_{k,ℓ}

Fig. 1. Main algorithm, intuition
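
The control flow of Figure 1 can be made concrete for the special case $p = 2$, where one may take $f(c) = c$ since $(f(b) - f(a))^2 = |b - a|^2$. The sketch below is an illustration under assumptions: the signs $\sigma_i$ are derived from a SHA-256 hash as a stand-in for the random variable families of the paper, the repetition counts are arbitrary defaults, and the names `sigma` and `estimate_l2_difference` are hypothetical.

```python
import hashlib
from statistics import mean, median

def sigma(seed, k, l, i):
    """Pseudo-random sign for index i in repetition (k, l); a stand-in for sigma_i."""
    digest = hashlib.sha256(f"{seed}:{k}:{l}:{i}".encode()).digest()
    return 1 if digest[0] & 1 else -1

def estimate_l2_difference(stream, groups=5, per_group=16, seed=0):
    """One pass over tuples (i, c, theta); returns median_k avg_l Z_{k,l}^2."""
    z = [[0] * per_group for _ in range(groups)]
    for i, c, theta in stream:
        for k in range(groups):
            for l in range(per_group):
                # f(c) = c is the assumed choice of f for p = 2.
                z[k][l] += sigma(seed, k, l, i) * theta * c
    return median(mean(v * v for v in row) for row in z)

# Usage: a_0 = 5, b_0 = 2, a_1 = b_1 = 7, presented in interleaved order.
stream = [(0, 5, +1), (1, 7, +1), (1, 7, -1), (0, 2, -1)]
print(estimate_l2_difference(stream))
```

In this usage example only index 0 differs, so every counter equals $\pm 3$ and the output is exactly $9 = |5 - 2|^2$; with several differing indices the output is only an approximation of $\sum_i |a_i - b_i|^2$.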



3.2 Construction of f, Overview

Construction of $f$ is the main technical content of this paper. We construct a
function $f : Z \to Z$ such that

$E[(f(b) - f(a))^2] = (1 \pm \varepsilon)|b - a|^p.$ (7)

To do this, we will first define a function $d(a,b)$ such that
– $|d(a,b)| \in O(|b - a|^{p/2})$ for all $a$ and $b$,

– $|d(a,b)| \in \Omega(|b - a|^{p/2})$ for a significant fraction of $a$ and $b$, and

– $d(r,b) - d(r,a) = d(a,b)$ for all $r$.

Next, we define a family $\{T_R\}$ of transformations on the reals, with corresponding
inverse scale factors $\phi(R)$ such that:

– the transformation is an approximate isometry, i.e.,
$|a - b|^p \approx \phi^2(R)|T_R(b) - T_R(a)|^p$, and,

– for random $R$, the distribution of $|\phi(R) d(T_R(a), T_R(b))| / |b - a|^{p/2}$ is
approximately $\gamma_0$, for $\gamma_0$ independent of $a$ and $b$.

We then put $f(x) = c\phi(R)d(T_R(0), T_R(x))$ (rounding appropriately from reals
to integers). We have

$E_R[(f(b) - f(a))^2] = E_R[c^2\phi^2(R)(d(T_R(0),T_R(b)) - d(T_R(0),T_R(a)))^2]$ (8)

$= E_R[c^2\phi^2(R)(d(T_R(a),T_R(b)))^2]$ (9)

$\in O(E_R[c^2\phi^2(R)|T_R(b) - T_R(a)|^p])$ (10)

$\approx O(|b - a|^p).$ (11)






An Approximate L^-Difference Algorithm for Massive Data Streams 



199 



Because $|d(\alpha,\beta)| \in \Omega(|\beta - \alpha|^{p/2})$ for a significant fraction of $\alpha, \beta$
(according to the distribution $(\alpha,\beta) = (T_R(a), T_R(b))$), from the Markov
inequality we conclude that

$E_R[d(T_R(a),T_R(b))^2] \in \Omega(|T_R(b) - T_R(a)|^p).$ (12)

We have

$E_R[(f(b) - f(a))^2] = E_R[c^2\phi^2(R)(d(T_R(a),T_R(b)))^2]$ (13)

$\in \Omega(E_R[c^2\phi^2(R)|T_R(b) - T_R(a)|^p])$ (14)

$\approx \Omega(|b - a|^p).$ (15)

It follows that $E_R[(f(b) - f(a))^2] \in \Theta(|b - a|^p)$. Because the distribution on
$|d(T_R(a),T_R(b))| / |b - a|^{p/2}$ is approximately independent of $a$ and $b$, it follows
that $E_R[(f(b) - f(a))^2] \approx c'(1 \pm \varepsilon)|b - a|^p$, for $c'$ independent of $a$ and $b$. By
choosing $c$ appropriately, we can arrange that $c' = 1$.

We now proceed with a detailed construction of $f$.

3.3 Construction of f 

The function $d(a,b)$ takes the form $d(a,b) = \sum_{a \le j < b} \pi_j$, where $\pi_j$ is a
$\pm 1$-valued function of $j$, related to a function described in [?].

Lemma 2. For all rational $p \in (0,2]$, there exist integers $u$ and $v$ such that
$\log(v - u)/\log(v + u) = p/2$ and $v - u \ge 17$.

Proof. If $p/2 = \alpha/\beta$ for $\alpha \ge 5$, put $v = 2^{\beta-1} + 2^{\alpha-1}$ and
$u = 2^{\beta-1} - 2^{\alpha-1}$.
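
The construction in the proof of Lemma 2 can be written out as a short Python sketch. One detail is our assumption: to reach a representation $\alpha/\beta$ of $p/2$ with $\alpha \ge 5$, the reduced fraction is scaled by repeated doubling (any common multiplier would do). The function name `choose_u_v` is hypothetical.

```python
from fractions import Fraction

def choose_u_v(p):
    """Given rational p in (0, 2], return integers (u, v) with
    v - u = 2**alpha and v + u = 2**beta, where alpha/beta = p/2."""
    half = Fraction(p) / 2
    alpha, beta = half.numerator, half.denominator
    while alpha < 5:  # assumed scaling step: double until alpha >= 5
        alpha, beta = 2 * alpha, 2 * beta
    v = 2 ** (beta - 1) + 2 ** (alpha - 1)
    u = 2 ** (beta - 1) - 2 ** (alpha - 1)
    return u, v

# Usage: p = 3/2 gives alpha/beta = 6/8, so v - u = 2**6 and v + u = 2**8.
print(choose_u_v(Fraction(3, 2)))
```

Since $v - u = 2^\alpha$ and $v + u = 2^\beta$, the ratio of logarithms is $\alpha/\beta = p/2$, and $\alpha \ge 5$ gives $v - u \ge 32$. Passing `p` as a `Fraction` avoids the rounding that a float such as `0.1` would introduce.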

Now, we define a sequence $\pi$ of +1’s and −1’s, as follows. Let
$\pi = \lim_{i \to \infty} \pi_{(i)}$, where $\pi_{(i)}$ is defined recursively, for $i \ge 1$, as

$\pi_{(1)} = (+1)^u (-1)^v$ (16)

$\pi_{(i+1)} = \pi_{(i)}^u \, \bar{\pi}_{(i)}^v$ (17)

and $\bar{\pi}_{(i)}$ denotes $\pi_{(i)}$ with all +1’s replaced by −1’s and all −1’s replaced by
+1’s. Note that $\pi_{(i)}$ is a prefix of $\pi_{(i+1)}$. For example, a graph of $\pi$ with $u = 1$
and $v = 3$ is given in Figure 2 (Figure 2 also describes sets $S_{s,t}$, to be defined
later).

Let $\pi_j$ (as opposed to $\pi_{(j)}$) denote the $j$’th symbol of $\pi$.

Definition 3. Let $d(a,b) = \sum_{j=a}^{b-1} \pi_j$ be the discrepancy of $\pi$ between +1’s and
−1’s in the interval $[a,b)$.
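
The recursion (16)-(17) and Definition 3 translate directly into code. The following Python sketch assumes 0-based indexing of the symbols $\pi_j$ (the text does not fix the starting index); the names `pi_prefix` and `discrepancy` are ours.

```python
def pi_prefix(u, v, levels):
    """Return pi_(levels) as a list of +1/-1 values: pi_(1) = (+1)^u (-1)^v and
    pi_(i+1) = u copies of pi_(i) followed by v copies of its negation."""
    seq = [1] * u + [-1] * v
    for _ in range(levels - 1):
        negated = [-s for s in seq]
        seq = seq * u + negated * v
    return seq

def discrepancy(pi, a, b):
    """d(a, b): the sum of pi_j over the half-open interval [a, b)."""
    return sum(pi[a:b])

# Usage with u = 1, v = 3: pi_(3) has (u + v)**3 = 64 symbols.
pi = pi_prefix(1, 3, 3)
print(pi[:8], discrepancy(pi, 1, 4))
```

With $u = 1$ and $v = 3$ the prefix starts $+1, -1, -1, -1$, and one can check on this prefix that $d(4a, 4b) = -2\,d(a,b)$, in line with the homogeneity property (22).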

Note that $d$ and $\pi$ depend on $u$ and $v$. We only consider one set of values for
$u, v$ at a time and drop them from the notation.

We will need the following notation:

Definition 4. The randomized rounding $[x]_\rho$ of a real number $x$ by a (random)
real $\rho \in [0,1]$ is defined by

$[x]_\rho = \lceil x \rceil$ if $x \ge \rho + \lfloor x \rfloor$, and $[x]_\rho = \lfloor x \rfloor$ otherwise. (18)
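
Definition 4 can be illustrated with a few lines of Python. The sketch assumes that ties round up (a non-strict comparison); the function name `randomized_round` is ours.

```python
import math

def randomized_round(x, rho):
    """Round x up when its fractional part is at least rho, else down.
    rho is expected to lie in [0, 1]."""
    lower = math.floor(x)
    return math.ceil(x) if x >= rho + lower else lower

# Usage: 2.25 rounds up for rho = 0.125 and down for rho = 0.5.
print(randomized_round(2.25, 0.125), randomized_round(2.25, 0.5))
```

When $\rho$ is uniform on $[0,1]$, $x$ is rounded up with probability equal to its fractional part, so the expected value of $[x]_\rho$ is $x$.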



We now define the transformation $T_R()$ and the inverse scale factor $\phi(R)$.

Definition 5. Let $r$ be $(u + v)^s$, where $s$ is chosen uniformly at random from
the real interval $[N_1, N_2]$. Let $r'$ be an integer chosen uniformly at random from
$[0, N_3)$. Finally, let $\rho$ be a uniformly chosen real number in $[0,1]$. For
$R = (r, r', \rho)$, put $T_R(a) = [ra]_\rho + r'$ and put $\phi(R) = r^{-p/2}$.
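
Definition 5 can be sketched in Python as follows. The parameters `n1`, `n2` and `n3` stand for $N_1$, $N_2$ and $N_3$ and must be supplied by the caller; the function names are hypothetical, and the rounding rule repeats Definition 4 with ties rounded up (an assumption).

```python
import math
import random

def sample_R(u, v, n1, n2, n3, rng):
    """Draw R = (r, r', rho): r = (u + v)**s with s uniform in [n1, n2],
    r' a uniform integer in [0, n3), rho uniform in [0, 1)."""
    s = rng.uniform(n1, n2)
    return (u + v) ** s, rng.randrange(n3), rng.random()

def transform(R, a):
    """T_R(a) = [r * a]_rho + r', using the randomized rounding of Definition 4."""
    r, r_prime, rho = R
    x = r * a
    lower = math.floor(x)
    rounded = math.ceil(x) if x >= rho + lower else lower
    return rounded + r_prime

def inverse_scale(R, p):
    """phi(R) = r**(-p/2)."""
    return R[0] ** (-p / 2)

# Usage with a fixed R, so that the result is reproducible.
R = (4.0, 7, 0.5)
print(transform(R, 3), inverse_scale(R, 1))
```

For the fixed $R = (4, 7, 0.5)$ above, $T_R(3) = 12 + 7 = 19$ and $\phi(R) = 4^{-1/2} = 0.5$ for $p = 1$.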

In this extended abstract, we don’t give $N_1$, $N_2$, and $N_3$ precisely. The
following is true:

$N_1 = \log(8)/\log(u + v)$ (19)

$N_2 = N_1 + O(\log(M)/\varepsilon)$ (20)


Ns = 0{]VY+^^Y 


(21) 



Let $\bar{d}(a,b)$ denote $\phi(R)d(T_R(a), T_R(b))$, i.e., $d$ acting in the transformed
domain.

We apply the below properties about $\pi$, $d$, $T_R$, and $\phi(R)$ in our proof of the
main theorem. Some of the following assume that $v - u \ge 17$. The constants
$c_1, c_2, c$ below may depend on $p$, $M$, and $\varepsilon$, but are bounded uniformly in $M$
and $\varepsilon$.

$d((u + v)a, (u + v)b) = -(v - u)d(a,b).$ (22)

For all $r$, all $a < b < (u + v)^r$ and all $x$,
$|d(a,b)| = |d(a + x(u + v)^r, b + x(u + v)^r)|.$ (23)

For some $c_2$, $|d(a,b)| \le c_2 (b - a)^{p/2}.$ (24)

For some $c_1 > 0$ and some $\eta > 0$, $\Pr_R(|\bar{d}(a,b)| \ge c_1 |b - a|^{p/2}) \ge \eta.$ (25)

For some $\gamma_0 > 0$, $E_R[|\bar{d}(a,b)|] = \gamma_0 |b - a|^{p/2}(1 \pm \varepsilon).$
For some $\gamma_1 > 0$, $E_R[\bar{d}^2(a,b)] = \gamma_1 |b - a|^{p}(1 \pm \varepsilon).$
For some $\gamma_2 > 0$, $E_R[\bar{d}^4(a,b)] = \gamma_2 |b - a|^{2p}(1 \pm \varepsilon).$ (26)

We omit the proofs of (22) (homogeneity), (23) (periodicity), (24) (upper
bound), and (26) (average) altogether. The proof of (25) (averaged lower bound)
consists of two lemmas, that we present without proof.

Proof (of (25), averaged lower bound).

We identify a set $S$ of $(a,b)$ values. We then show in Lemma 3 that $|d(a,b)|$
is large on $S$ and we show in Lemma 4 that the set $S$ itself is big. The result
follows.

Definition 6. Fix $u$ and $v$. We define a set $S$ of $(a,b)$ values as follows. For
each integer $s$ and $t$, let $S_{s,t}$ consist of the pairs $(a,b)$ such that

$t(u + v)^{s+1} + u(u + v)^s \le a < t(u + v)^{s+1} + (u + 1)(u + v)^s$
$t(u + v)^{s+1} + (u + v - 1)(u + v)^s < b \le (t + 1)(u + v)^{s+1}$

Let $S$ be the (disjoint) union of all $S_{s,t}$.

One can view this definition geometrically. (See Figure 2.) □

Fig. 2. Geometric view of $\pi$ (continuous polygonal curve) for $u = 1$ and $v = 3$. The
sets $S_{s,t}$ are indicated by segments with vertical ticks. Each element of $S_{s,t}$ is a
pair $(\alpha, \beta)$, indicated in the diagram by two of the vertical ticks near opposite
ends of the interval labeled $S_{s,t}$. The discrepancy of $\pi$ is relatively high over
intervals with endpoints in $S_{s,t}$. Note that the pattern $\pi$ and sets $S_{s,t}$ are
self-similar (loosely defined). Elements of $S_{s,0}$ are close to analogs of
$S_{0,0} = \{(u, u + v)\}$, scaled up (by $(u + v)^s$). Elements of $S_{s,t}$ are analogs of
elements of $S_{s,0}$, translated (by $t(u + v)^{s+1}$).



Lemma 3. For some constant $c_1$, for each $(a,b) \in S$, we have

$|d(a,b)| \ge c_1 |b - a|^{p/2}.$ (28)

Lemma 4. Let $a, b < M$ be arbitrary. Then for any $u, v$ there exists $\eta$ such that
$\Pr((T_R(a), T_R(b)) \in S) \ge \eta$.

3.4 Algorithm in the Sketch Model 

We can also use this algorithm in the sketch model. Perform $O(\log(1/\delta)/\varepsilon^2)$
parallel repetitions of the following. Given one function $(a_i)$, construct the small
sketch $A = \sum_i \sigma_i f(a_i)$. Later, given the two sketches $A = \sum_i \sigma_i f(a_i)$ and
$B = \sum_i \sigma_i f(b_i)$, one can reconstruct the $L^p$ difference by outputting a median of
means of independent copies of $|A - B|^2$.
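
The sketch-and-reconstruct pattern of this section can be illustrated for the special case $p = 2$ with $f(c) = c$. As assumptions of this illustration: both parties share a seed, the signs $\sigma_i$ come from a SHA-256 hash rather than from the paper's random variable families, and the names `sign`, `sketch` and `reconstruct` are hypothetical.

```python
import hashlib
from statistics import mean, median

def sign(seed, rep, i):
    """Shared pseudo-random sign sigma_i for repetition rep."""
    digest = hashlib.sha256(f"{seed}:{rep}:{i}".encode()).digest()
    return 1 if digest[0] & 1 else -1

def sketch(values, reps, seed=0, f=lambda c: c):
    """values maps an index i to a function value; the sketch is a list of
    reps counters, each equal to sum_i sigma_i * f(value_i)."""
    return [sum(sign(seed, rep, i) * f(c) for i, c in values.items())
            for rep in range(reps)]

def reconstruct(sk_a, sk_b, groups):
    """Median over groups of the mean of |A - B|^2 within each group."""
    squares = [(x - y) ** 2 for x, y in zip(sk_a, sk_b)]
    size = len(squares) // groups
    return median(mean(squares[g * size:(g + 1) * size]) for g in range(groups))

# Usage: the two functions differ only at index 0, by 3.
A = sketch({0: 5, 1: 7}, reps=20)
B = sketch({0: 2, 1: 7}, reps=20)
print(reconstruct(A, B, groups=5))
```

Each sketch can be computed without access to the other function; only the seed must agree. In the usage example every counter difference is $\pm 3$, so the reconstruction returns exactly 9.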

3.5 Top-level Algorithm, 2 < p < 4

For rational $p \in (2,4)$, a similar algorithm (with similar analysis) approximates



(29)

with small relative error, whence one can approximate $\sum_i |a_i - b_i|^p$, for $2 < p < 4$,
with error small compared with \ai — We omit the details. 

We now analyze this error guarantee compared with error guarantees of related
work. In previous work, there are three types of error guarantees given. In
typical sampling arguments, an additive error is produced which is small
compared with $n$ or even $Mn$. The techniques of [?] can be used to approximate
$\sum_i |a_i - b_i|$ when $a_i, b_i \in \{0, 1\}$. In this case, the error is guaranteed to be
small compared with $\sum_i |a_i + b_i|$, already a substantial improvement over additive
error and the best possible in the original context of [?]. Relative error, i.e.,
error small compared with the returned value, is better still, and is achievable
for $p \in [0,2]$. Our error guarantee for $p > 2$ falls between relative error and
error small compared with $\sum_i |a_i + b_i|$:

joi - 6*1^ < e joi - < e ^ ja* -t- 6i| < O(eMn) . (30) 

In particular, our error in approximating ^ ja^ — 5^1*’ is small compared with 
[ci — 6i|P) , so our error gets small as the returned value ^ joi — 6^]^ gets 
small — this is not true for an error bound of, say, ^ \ai + bi\P or ^ ja^ + bi\. 

Since I®* ~ I®* ~ I®* ~ one could also ap- 
proximate ~ ^*1^ t)y 

. (31) 

1^2 

This will be correct to within the factor (^ ja^ — 5^]^/^) . Note that our al- 

gorithm is an unbiased estimator, «.e., it has the correct mean — an advantage 
in some contexts; this is not true of (^ ja^ — 6^]^’/^)^^^. Furthermore, our algo- 
rithm provides a smooth trade-off between guaranteed error and cost, which is 
not directly possible with the trivial solution. We hope that, in some applica- 
tions, our approximation to ^ ja^ — bi\^ provides information not contained in 

4 Discussion 

4.1 Relationship with Previous Work 

We give an approximation algorithm for, among other cases, $p = 1$ and $p = 2$.
The $p = 1$ case was first solved in [7], using different techniques. Our algorithm
is less efficient in time and space, though by no more than a power. The case
$p = 2$ was first solved in [?], and it is easily seen that our algorithm for the
case $p = 2$ coincides with the algorithm of [?]. Our algorithm is similar to [?]
at the top level, using the strategy proposed by [2].

4.2 Random-Self-Reducibility

Our proof technique can be regarded as exploitation of a random-self-reduction
[?] of the $L^p$ difference. Roughly, a function $f(x)$ is random-self-reducible via
$(\sigma, \phi)$ if, for random $r$, $f(x) = \phi(r, f(\sigma(r,x)))$, where the distribution of
$\sigma(r,x)$ does not depend on $x$. The function of two streams is random-self-reducible
in the sense that $|a_i - b_i| = \frac{1}{r}|(ra_i + r') - (rb_i + r')|$, where the distribution
$(ra_i + r', rb_i + r')$ is only weakly dependent on $(a_i, b_i)$. We omit further
discussion due to space considerations.



4.3 Determination of Constants 



Our function $f$ involves some constant $c$, such that $E[(f(a) - f(b))^2] \approx |b - a|^p$,
which we do not explicitly provide. This needs to be investigated further. We
give a few comments here.

One can approximate $c$ using a randomized experiment. Due to our fairly
tight upper and lower bounds for $c$, we can, using Lemma 1, estimate $c$ reliably as
$d(a,b) \cdot |b - a|^{-p/2}$. This occurs once for each $p$, $M$, $n$, and $\varepsilon$. It is not
necessary to do this once for each item or even once for each stream, and one can
fix generously large $M$ and $n$ and generously small $\varepsilon$ to avoid repeating the
estimation of $c$ for changes in these values.

In some practical cases, not knowing $c$ may not be a drawback. In practice,
as in [?], one may use the measure $\sum_i |a_i - b_i|^p$ to quantify the difference between
two web pages, where $a_i$ is the number of occurrences of feature $i$ in page A and
$b_i$ is the number of occurrences of feature $i$ in page B. For example, one may
want to keep a list of non-duplicate web pages, where two web pages that are
close enough may be deemed to be duplicates. According to this model, there
are sociological empirical constants $\hat{c}$ and $p$ such that web pages with small value
of $\hat{c} \sum_i |a_i - b_i|^p$ are considered to be duplicates. To apply this model, one must
estimate the parameters $\hat{c}$ and $p$ by doing sociological experiments, e.g., by asking
human subjects whether they think pairs of web pages, with varying measures
of $\hat{c} \sum_i |a_i - b_i|^p$ for various values of $\hat{c}$ and $p$, are or are not duplicates. If one
does not know $c$, one can simply estimate $\hat{c}/c$ at once by the same sociological
experiment.



4.4 Non-grouped Input Representation 

Often, in practice, one wants to compare $(a_i)$ and $(b_i)$ when the values $a_i$ and
$b_i$ are represented differently. For example, suppose there are two grocery stores,
A and B, that sell the same type of items. Each time either store sells an item it
sends a record of this to headquarters in an ongoing stream. Suppose item $i$ sells
$a_i$ times in store A and $b_i$ times in store B. Then headquarters is presented with
two streams, A and B, such that $i$ appears $a_i$ times in A and $b_i$ times in B;
$\sum_i |a_i - b_i|^p$ measures the extent to which sales differ in the two stores.
Unfortunately, we don’t see how to apply our algorithm in this situation.
Apparently, in order to use our algorithm, each store would have to aggregate
sales data and present $a_i$ or $b_i$, rather than present $a_i$ or $b_i$ non-grouped
occurrences of $i$. The algorithm of [?] solves the $p = 2$ case in the non-grouped
case, but the problem for other $p$ is important and remains open.
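
The aggregation step that each store would have to perform can be sketched as follows. This is only an illustration of the grouped versus non-grouped representations; the function name `group_stream` and the item names are hypothetical, and the output uses the tuple format $(i, c, \theta)$ of Section 3.1.

```python
from collections import Counter

def group_stream(occurrences, theta):
    """Turn a non-grouped stream of item identifiers into grouped tuples
    (i, count, theta), one per distinct item."""
    counts = Counter(occurrences)
    return [(i, c, theta) for i, c in sorted(counts.items())]

# Usage: store A reports individual sales; theta = +1 marks values a_i.
sales_a = ["milk", "eggs", "milk"]
print(group_stream(sales_a, +1))
```

Note that this aggregation stores one counter per distinct item, which is exactly the kind of local buffering that the non-grouped setting would like to avoid.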

We have recently learned of a possible solution to the non-grouped problem.
Note that, in general, a solution A in the non-grouped representation yields a
solution in the function-value representation, since, on input $a_i$, an algorithm
can simulate A on $a_i$ occurrences of $i$; this simulation takes time exponential
in the size of $a_i$ to process $a_i$. The proposed solution, however, appears to be
of efficiency comparable to ours in the function-value representation, at least in
theory, but there may be implementation-related reasons to prefer our algorithm
in the grouped case.


Abstract. In this paper, following the approach of Gogic, Kautz,
Papadimitriou and Selman (1995), we consider the ability of belief revision
operators to succinctly represent a certain set of models. In particular,
we show that some of these operators are more efficient than others,
even though they have the same model checking complexity. We show
that these operators are partially ordered, i.e. some of them are not
comparable. We also strengthen some of the results by Cadoli, Donini,
Liberatore and Schaerf (1995) by showing that for some of the so-called
“model based” operators, a polynomial size representation does not exist
even if we allow the new knowledge base to have a non-polynomial-time
model checking (namely, either in NP or in co-NP). Finally, we show that
Dalal’s and Weber’s operators can be compiled one into the other via a
formalism whose model checking is in NP. All of our results also hold
when iterated revision, for one or more of the operators, is considered.



1 Introduction 

Several formalisms for knowledge representation and nonmonotonic reasoning
have been proposed and studied in the literature. Such formalisms often give rise
to intractable problems, even when propositional versions of such formalisms are
considered (see [?] for a survey).

Knowledge compilation aims to avoid these difficulties through an off-line
process where a given knowledge base is compiled into an equivalent one that
supports queries more efficiently. The feasibility of this approach has been
investigated in depth with respect to several factors, such as the formalism
used for the original and the resulting knowledge base, the kind of equivalence
required, and so on (see 0 for a survey). For example, let us consider the
propositional version of circumscription ($\mathcal{CIRC}$), a well known form
of nonmonotonic reasoning introduced in the AI literature. Informally,
$\mathcal{CIRC}(T)$ denotes those truth assignments that satisfy T and that
have a “minimal” set of variables mapped into 1. The idea behind minimality is
to assume that a fact is false whenever possible. In particular, we represent
a truth assignment as a subset m of variables of

* Part of this work has been done while the author was visiting the research center of
INRIA Sophia Antipolis (SLOOP project). 
T (those mapped into 1) and we say that m is a model if the corresponding truth
assignment satisfies T. Then, $\mathcal{CIRC}(T)$ contains only the models of T
that are minimal w.r.t. set inclusion (see Sect. 1.2 for a formal definition).
Although it is possible to explicitly represent all the models in
$\mathcal{CIRC}(T)$, this representation in general has size exponential in the
size of T. So, a shorter (implicit) representation is given by the
propositional formula T. However, representing the set of models
$\mathcal{CIRC}(T)$ simply as T yields an overhead from the computational point
of view. For instance, given T and a subset m of its variables, deciding
whether $m \in \mathcal{CIRC}(T)$ (model checking) is a co-NP-complete problem
|^. Notice that in classical propositional logic ($\mathcal{PL}$) a formula F
simply represents all of its models, so model checking for $\mathcal{PL}$ is
clearly in P. Similarly, deciding whether a formula logically follows from
$\mathcal{CIRC}(T)$ (inference) is a $\Pi_2^p$-complete problem P], while
inference for $\mathcal{PL}$ is co-NP-complete. A natural question is
therefore: is it possible to “translate” $\mathcal{CIRC}(T)$ into a
propositional formula F and then use F (instead of T) to solve the model
checking problem in time polynomial in |T|? Clearly such a translation cannot
be performed in polynomial time unless P = NP (that is why we need to do it
off-line). Additionally, a necessary condition is that the size of F be
polynomially bounded in the size of T. A negative answer to this question has
been given in |S|, where the authors proved that, in general, |F| is not
polynomially bounded in |T|. Informally, this is due to the fact that
$\mathcal{CIRC}$ allows for representations of the information (i.e. a set of
models) that are much more “succinct” than any equivalent representation in
$\mathcal{PL}$ (see 0 for more formal definitions of what ‘equivalent’ means).

The above idea of compiling one formalism into another has been extended
in P3|, where the relative succinctness (also known as compactness or space
efficiency) of several propositional logical formalisms has been investigated.
Two formalisms are compared as follows. A formalism $F_1$ is more efficient
than a formalism $F_2$ if: (a) $F_2$ can be compiled into $F_1$ and (b) $F_1$
cannot be compiled into $F_2$, where the compilation requires the new knowledge
base to be model equivalent to the original one and to have size polynomial in
it. It is worth observing that, on the one hand, succinctness implies
non-compactability. On the other hand, the converse does not always hold, since
it might be the case that $F_1$ and $F_2$ cannot be compiled one into the
other, i.e. they are not comparable. A somewhat surprising result of m is that
formalisms having the same model checking time complexity are instead totally
ordered in terms of succinctness. In this case, succinctness becomes crucial in
choosing one formalism instead of another to represent the knowledge.

Another important aspect of nonmonotonic reasoning is that we have to 
deal with uncertain and/or incomplete information. Several criteria for updating 
and/or revising a knowledge base have been proposed pi fTTl IT^ I7T1 I24j . 
Suppose we have a knowledge base T and a new piece of information, repre- 
sented by a formula P, is given. It might be the case that T and P are not 
consistent. In this case the revision of T with P, denoted as T o P, contains 
those models of P defined by means of a belief revision operator ‘o’. The so 
called model based operators define the set of models of T o P as those models of 
P that are “close” to the models of T. Different definitions of closeness
correspond to different revision operators. Syntax based approaches are instead
defined in terms of syntactic operations on the knowledge base T. In general,
model based approaches are preferred to syntax based ones because of their
syntax irrelevance, i.e. revising two logically equivalent knowledge bases T
and T' with a formula P always yields the same set of models. Also in this case
model checking and inference become harder than in $\mathcal{PL}$ IT^ 0 (see
also Table 1). This is a first motivation for investigating compilability of
belief revision into $\mathcal{PL}$. Moreover, since we are dealing with
revision of knowledge, it is often required to explicitly compute a
propositional formula T' equivalent to $T \circ P$, that is, the revised
knowledge base. In such a case it might be desirable not to have an exponential
increase in the size of the original knowledge base. Unfortunately, for several
revision operators non-compactability results have been proved in |S|. In the
same paper a weaker kind of equivalence has also been considered: query
equivalence. In this case the compilation does not preserve the set of models
but just the set of formulas that logically follow. So, it can be used for
inference but not for model checking. In Table 1 we summarize both the
complexity and the compactability results proved for several belief revision
operators, both model based and syntax based (Ginsberg’s and WIDTIO).

It is interesting to observe that some revision operators and $\mathcal{CIRC}$
have similar properties. For instance, Ginsberg’s operator and $\mathcal{CIRC}$
have the same time complexity and the same compactability properties (see 0, Q
|H| for the results on $\mathcal{CIRC}$). It is therefore natural to ask
whether this is a coincidence or not. A first study of the relationships
between belief revision and $\mathcal{CIRC}$ was carried out in [23], where the
author remarked on similarities between $\mathcal{CIRC}$ and her operator.
Subsequently, interesting connections between $\mathcal{CIRC}$ and several
belief revision operators were pointed out, thus extending the result of [23].
In particular, it was proved that $\mathcal{CIRC}$ can be compactly represented
by means of several belief revision operators, i.e. given a propositional
formula F, two formulas T and P, of size polynomial w.r.t. |F|, exist such that
$T \circ P$ is logically equivalent to $\mathcal{CIRC}(F)$. This allows results
to be imported from one field into the other. For example, the above mentioned
result combined with the non-compactability results of $\mathcal{CIRC}$ can be
used to prove several of the negative results in jSj. Inverse reductions have
also been investigated, i.e. compiling belief revision into $\mathcal{CIRC}$,
but in this case query equivalence (instead of model equivalence) is
considered. In P], among other results, a precise characterization of the
compactability properties of Ginsberg’s operator is given. In fact, it can be
compiled into $\mathcal{CIRC}$ and vice versa. Additionally, this result also
holds for the case of iterated revision, i.e. when a polynomial number of
revision steps is considered, by making use of the fact that also in this case
model checking is in co-NP [10].

Finally, we remark that all of the non-compilability results in and some of
those in H3] rely on the standard hypothesis that the polynomial hierarchy does
not collapse. Moreover, the results in |S| hold if and only if this hypothesis
is true.

Operator            | Complexity                                                | Compactability
                    | Model Checking |1^         | Inference jSj                | Model [g | Query jg
--------------------+----------------------------+------------------------------+----------+---------
Ginsberg            | co-NP-complete             | $\Pi_2^p$-complete           | No       | No
Winslett, Borgida,  | $\Sigma_2^p$-complete      | $\Pi_2^p$-complete           | No       | No
Forbus, Satoh       |                            |                              |          |
Dalal               | $P^{NP[O(\log n)]}$-comp.  | $P^{NP[O(\log n)]}$-comp.    | No       | Yes
Weber               | $\Sigma_2^p$-complete      | $\Pi_2^p$-comp. T9[ & [§.J   | No       | Yes
WIDTIO              | $\Sigma_2^p$-complete      | $\Pi_2^p$-comp. [121 & P     | Yes      | Yes

Table 1. Previous results: complexity and compactability of belief revision
operators.

1.1 Results of the Paper 

In this paper we give a better characterization of the (non-)compactability
properties of belief revision and we provide important connections between such
operators and $\mathcal{CIRC}$. We consider the model based revision operators
in Table 1 and we compare their space efficiency with that of $\mathcal{CIRC}$,
as well as their relative compactness. In particular, we show that, for some
model based operators (Winslett’s, Borgida’s, Forbus’s, and Satoh’s), belief
revision is more difficult to compile than $\mathcal{CIRC}$, Ginsberg’s,
Dalal’s or Weber’s revision. For the latter two operators we give a precise
characterization of their compactability properties. Our results significantly
strengthen several non-compactability results in 0 and the results of JE).
Moreover, they provide an intuitive explanation of the (non-)compactability
results when query equivalence is considered. The results are obtained under
the assumption that the polynomial hierarchy does not collapse.

To this aim, we introduce a formalism, denoted as $\overline{\mathcal{CIRC}}$,
whose model checking is in NP and which is not comparable to $\mathcal{CIRC}$.
Roughly speaking, $\overline{\mathcal{CIRC}}$ can be seen as the “complement”
of $\mathcal{CIRC}$, i.e. $\overline{\mathcal{CIRC}}$ corresponds to the set of
non-minimal models of a propositional formula. In Fig. 1 we show the
relationships among revision operators and their space efficiency with respect
to $\mathcal{PL}$, $\mathcal{CIRC}$ and $\overline{\mathcal{CIRC}}$, where
one-way arrows represent the fact that one operator is strictly more succinct
than another. The results are consequences of previously known results combined
with the following two:

- $\overline{\mathcal{CIRC}}$ can be compiled into model based operators;

- Dalal’s and Weber’s operators can be compiled into $\overline{\mathcal{CIRC}}$.

As a consequence we have that Winslett’s, Borgida’s, Forbus’s and Satoh’s
operators are more succinct than all the other operators, and Ginsberg’s
operator is not comparable to Dalal’s or Weber’s. Moreover, Dalal’s and Weber’s
operators can be reduced to each other via $\overline{\mathcal{CIRC}}$. This
yields a precise characterization of their space efficiency w.r.t.
$\mathcal{CIRC}$ and the other belief revision operators. Additionally, the
fact that Dalal’s and Weber’s operators are equivalent to
$\overline{\mathcal{CIRC}}$ gives an intuitive explanation of their query
compactability properties [S| (see Table 1).

[Figure 1: diagram not recoverable; one node is labelled “Winslett, Borgida,
Forbus, Satoh”.]
Fig. 1. Results of the paper: the relative succinctness of belief revision
operators, where a dedicated mark means that the result also holds for iterated
revision.

Motivated by the non-compactability results of 0, we attempt to find a
trade-off between compactability and the complexity of model checking of the
knowledge base into which the original one is compiled. In particular, we
consider the following question:

Can we succinctly represent a revised knowledge base by means of
$\mathcal{CIRC}$? More generally, can it be compiled into a knowledge base
whose model checking is either in NP or in co-NP?

Since model checking for model based operators is harder than any problem in
NP or in co-NP, a positive answer to the above question can be used to make
model checking easier through an off-line compilation step. In Table 2 we
summarize the obtained results, which follow from properties of
$\mathcal{CIRC}$ and $\overline{\mathcal{CIRC}}$.



Operator | Compactable into a knowledge base whose model checking is in
         | NP                                         | co-NP
---------+--------------------------------------------+--------------------------------
Ginsberg | No (Corollary 1)                           | Yes, also iterated ([15], [10])
Winslett | No (Corollary 3)                           | No (Corollary 2)
Borgida  | No (Corollary 3)                           | No (Corollary 2)
Satoh    | No (Corollary 3)                           | No (Corollary 2)
Dalal    | Yes, also iterated (0, also Theorem 4)     | No (Corollary 2)
Weber    | Yes, also iterated ([Sj, also Theorem 4)   | No (Corollary 2)

Table 2. Results of the paper: compactability w.r.t. model checking time
complexity of the new knowledge base.
It is worth observing that:

- None of the model based operators admits compact representations whose
model checking is in co-NP.

- Dalal’s and Weber’s operators admit compact representations whose model
checking is in NP, even when iterated revision is considered.

- None of the other model based operators admits compact representations
whose model checking is in NP.

The latter result strengthens the negative result of Cadoli, Donini, Liberatore
and Schaerf (1995), in that no model equivalent knowledge base exists even when
we allow its model checking to be either in NP or in co-NP. Additionally, it is
not possible to compile a model based revision operator into $\mathcal{CIRC}$,
thus implying that the result of [T^ (which holds in the case of query
equivalence) cannot be extended to model equivalence.

We emphasize that the compactness of belief revision operators, in general,
does not seem to depend on either the complexity of inference and model
checking or the previously known compactability results. For instance,
Winslett’s and Weber’s operators have the same complexity (see Table 1) while
they are ordered in terms of space efficiency (see Fig. 1). Conversely, Dalal’s
and Weber’s operators, which have different complexity, can be compiled one
into the other. The reducibility of those two operators to
$\overline{\mathcal{CIRC}}$ also gives an intuitive explanation of their
compactability properties (see Table 1), which are in fact the same as those of
$\overline{\mathcal{CIRC}}$.

Due to lack of space some of the proofs of the above results will be omitted
or only sketched in this extended abstract.

1.2 Preliminaries 

Given a propositional formula F and a truth assignment m to the variables of F,
we say that m is a model of F if m satisfies F. Models will be denoted as sets
of variables (those mapped into 1). We denote by $\mathcal{M}(F)$ the set of
models of F. A theory is a set T of propositional formulas. The set of models
$\mathcal{M}(T)$ of the theory T is the set of models that satisfy all of the
formulas in T. If $\mathcal{M}(F) \neq \emptyset$ then the formula is
satisfiable. Similarly, a theory T is consistent if
$\mathcal{M}(T) \neq \emptyset$. We use $a \rightarrow b$ and $a \equiv b$ as a
shorthand for $\neg a \vee b$ and $(a \wedge b) \vee (\neg a \wedge \neg b)$,
respectively. Given two models m and n, we denote by $m \Delta n$ their
symmetric difference. Given a set of sets S, we denote by $\min_\subseteq S$
(respectively, $\max_\subseteq S$) the set of minimal (respectively, maximal)
elements of S w.r.t. set inclusion. The circumscription of a propositional
formula F (denoted by $\mathcal{CIRC}(F)$) is defined as

$$\mathcal{CIRC}(F) = \{m \in \mathcal{M}(F) \mid \forall m' \subset m,\ m' \notin \mathcal{M}(F)\} = \min_\subseteq \mathcal{M}(F).$$
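
As a small illustration of the definition of circumscription, the following Python sketch enumerates the models of a formula by brute force and keeps the inclusion-minimal ones. The function names and the example formula are assumptions made for this illustration, not part of the paper; the enumeration is exponential in the number of variables and is only meant for tiny examples.

```python
from itertools import combinations

def all_assignments(variables):
    """Yield every truth assignment as the frozenset of variables mapped to 1."""
    ordered = sorted(variables)
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            yield frozenset(subset)

def models(formula, variables):
    """M(F): all assignments satisfying `formula` (a predicate on a frozenset)."""
    return {m for m in all_assignments(variables) if formula(m)}

def circ(model_set):
    """CIRC(F): the models having no proper subset that is also a model."""
    return {m for m in model_set if not any(n < m for n in model_set)}

# Hypothetical example formula: F = (a or b) and (b -> c)
F = lambda m: ("a" in m or "b" in m) and ("b" not in m or "c" in m)
M = models(F, {"a", "b", "c"})
minimal = circ(M)
print(sorted(sorted(m) for m in minimal))
```

Here the formula has four models, and only {a} and {b, c} are minimal w.r.t. set inclusion.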

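Given a theory T and a propositional formula P, we denote by $T \circ P$ the
theory T revised with P according to some belief revision operator $\circ$. We
distinguish the following belief revision operators:

Ginsberg. Let $W(T,P) = \max_\subseteq \{T' \subseteq T \mid T' \cup \{P\} \not\models \bot\}$. Then
$$T \circ_G P = \{T' \cup \{P\} \mid T' \in W(T,P)\}.$$

Winslett. Let $\mu(m,P) = \min_\subseteq \{m \Delta n \mid n \in \mathcal{M}(P)\}$. Then,
$$\mathcal{M}(T \circ_{Win} P) = \{n \in \mathcal{M}(P) \mid \exists m \in \mathcal{M}(T) : m \Delta n \in \mu(m,P)\}.$$

Borgida. $T \circ_B P$ is defined as $T \circ_{Win} P$ if $T \cup \{P\}$ is not
consistent, and it is defined as $T \cup \{P\}$ otherwise.

Forbus. For any two models $m_1$ and $m_2$ let $d(m_1,m_2) = |m_1 \Delta m_2|$. Also let
$k_{m,P} = \min\{d(m,n) : n \in \mathcal{M}(P)\}$. Then,
$$\mathcal{M}(T \circ_F P) = \{n \in \mathcal{M}(P) \mid \exists m \in \mathcal{M}(T) : d(m,n) = k_{m,P}\}.$$

Satoh. Let $\delta(T,P) = \min_\subseteq \bigcup_{m \in \mathcal{M}(T)} \mu(m,P)$. Then,
$$\mathcal{M}(T \circ_S P) = \{n \in \mathcal{M}(P) \mid \exists m \in \mathcal{M}(T) : m \Delta n \in \delta(T,P)\}.$$

Dalal. Let $k_{T,P} = \min\{k_{m,P} \mid m \in \mathcal{M}(T)\}$. Then,
$$\mathcal{M}(T \circ_D P) = \{n \in \mathcal{M}(P) \mid \exists m \in \mathcal{M}(T) : d(m,n) = k_{T,P}\}.$$

Weber. Let $\Omega = \bigcup \delta(T,P)$, i.e. $\Omega$ contains all of the
variables appearing in a minimal difference between models of T and models of
P. Then,
$$\mathcal{M}(T \circ_{Web} P) = \{n \in \mathcal{M}(P) \mid \exists m \in \mathcal{M}(T) : m \Delta n \subseteq \Omega\}.$$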
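
The model based operators above can be sketched directly on explicit sets of models. The Python code below is only an illustration under stated assumptions: models are frozensets of variable names, both model sets are non-empty and given explicitly (which is exponential in general), and all function names are hypothetical. Ginsberg's operator is syntax based and is not covered.

```python
def min_subsets(sets):
    """Inclusion-minimal members of a collection of frozensets."""
    sets = set(sets)
    return {s for s in sets if not any(t < s for t in sets)}

def mu(m, MP):
    """mu(m, P): minimal symmetric differences between m and the models of P."""
    return min_subsets({m ^ n for n in MP})

def delta(MT, MP):
    """delta(T, P): minimal elements of the union of mu(m, P) over models m of T."""
    return min_subsets({d for m in MT for d in mu(m, MP)})

def winslett(MT, MP):
    return {n for n in MP if any((m ^ n) in mu(m, MP) for m in MT)}

def borgida(MT, MP):
    common = MT & MP  # non-empty exactly when T and P are consistent together
    return common if common else winslett(MT, MP)

def forbus(MT, MP):
    return {n for m in MT for n in MP
            if len(m ^ n) == min(len(m ^ k) for k in MP)}

def satoh(MT, MP):
    d = delta(MT, MP)
    return {n for n in MP if any((m ^ n) in d for m in MT)}

def dalal(MT, MP):
    k = min(len(m ^ n) for m in MT for n in MP)  # k_{T,P}
    return {n for n in MP if any(len(m ^ n) == k for m in MT)}

def weber(MT, MP):
    omega = frozenset().union(*delta(MT, MP))  # variables in minimal differences
    return {n for n in MP if any((m ^ n) <= omega for m in MT)}

# Hypothetical example: T has the single model {a, b}; P has three models.
MT = {frozenset({"a", "b"})}
MP = {frozenset(), frozenset({"a"}), frozenset({"b", "c"})}
for op in (winslett, borgida, forbus, satoh, dalal, weber):
    print(op.__name__, sorted(sorted(n) for n in op(MT, MP)))
```

On this example the minimal differences are {b} and {a, c}, so Winslett's and Satoh's operators keep {a} and {b, c}, Forbus's and Dalal's keep only {a} (distance 1), and Weber's keeps all three models of P because every difference is contained in Ω = {a, b, c}.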

An advice-taking Turing machine is a Turing machine that can access an advice
a(n), i.e. an “oracle” whose output depends only on the size n of the input. The
class NP/poly is the class of those languages that are accepted by a
nondeterministic Turing machine with an advice of polynomial size (see m for a
formal definition). The class co-NP/poly is similarly defined. In m non-uniform
classes such as NP/poly and the polynomial hierarchy have been related. In
particular, it has been proved that if NP $\subseteq$ co-NP/poly or
co-NP $\subseteq$ NP/poly then the polynomial hierarchy (denoted by PH)
collapses at the third level, i.e. $PH = \Sigma_3^p$ (see m for a formal
definition of those concepts), which is considered very unlikely in the
complexity community.

2 The Complemented Circumscription and Its Properties 

In this section we introduce $\overline{\mathcal{CIRC}}$ and state its basic
(non-)compactability properties that will be used in the rest of the paper. To
this aim we first introduce a model equivalence preserving reduction used in P|
and we assume that a knowledge base K in a formalism $\mathcal{F}$ represents a
set of models $\mathcal{F}(K)$.

Definition 1 ([3]). Given two logical formalisms $\mathcal{F}_1$ and
$\mathcal{F}_2$, $\mathcal{F}_1 \mapsto \mathcal{F}_2$ if the following holds:
for each knowledge base $K_1$ in $\mathcal{F}_1$, there exists a knowledge base
$K_2$ in $\mathcal{F}_2$ and a polynomial time computable function $g_{K_1}$
such that (i) for any set of variables $m_1$,
$m_1 \in \mathcal{F}_1(K_1) \Leftrightarrow g_{K_1}(m_1) \in \mathcal{F}_2(K_2)$;
(ii) $|K_2|$ is polynomially bounded in $|K_1|$.

The above definition implies that, once we have computed (off-line) the
formula $K_2$, we can decide whether $m_1$ is a model of $K_1$ by checking if
$g_{K_1}(m_1)$ is a model of $K_2$. Additionally, $g_{K_1}(m_1)$ can be computed
in polynomial time. Finally, the relation $\mapsto$ is transitive.

Definition 2. We denote by $\overline{\mathcal{CIRC}}(F)$ the set of
non-minimal models of a propositional formula F, that is

$$\overline{\mathcal{CIRC}}(F) = \mathcal{M}(F) \setminus \mathcal{CIRC}(F) = \{m \in \mathcal{M}(F) \mid \exists m' \in \mathcal{M}(F) : m' \subset m\}.$$
Lemma 1. For any n, a propositional formula $F_n$ (of size polynomial in n)
exists such that for every n-variable 3CNF propositional formula f there
exists a model $m_f$ (computable in polynomial time) such that

$$f \text{ is unsatisfiable} \iff m_f \in \mathcal{CIRC}(F_n).$$

The above lemma states that the circumscription of a formula $F_n$ is able
to capture all of the unsatisfiable 3CNF formulas with n variables (notice that
$F_n$ depends only on n). As a consequence we obtain the following result, whose
proof is similar to the non-compilability proofs given in Pig.

Theorem 1. The following hold: (i)
$\mathcal{CIRC} \mapsto \overline{\mathcal{CIRC}} \Rightarrow$ co-NP $\subseteq$ NP/poly;
(ii) $\overline{\mathcal{CIRC}} \mapsto \mathcal{CIRC} \Rightarrow$ NP $\subseteq$ co-NP/poly.

The above theorem can be easily generalized to any two formalisms whose
model checking is in co-NP and NP, respectively. Thus, the following corollary
holds.

Corollary 1. Let $\mathcal{F}_{\text{co-NP}}$ and $\mathcal{F}_{\text{NP}}$ be
any two formalisms whose model checking is in co-NP and NP, respectively.
Unless the polynomial hierarchy collapses at the third level, the following two
hold: (i) $\mathcal{CIRC} \not\mapsto \mathcal{F}_{\text{NP}}$; (ii)
$\overline{\mathcal{CIRC}} \not\mapsto \mathcal{F}_{\text{co-NP}}$.

In the rest of the paper we will make use of the above result and thus we will
always assume $PH \neq \Sigma_3^p$.



3 Reducing $\overline{\mathcal{CIRC}}$ to Belief Revision

In this section we provide some reductions from $\overline{\mathcal{CIRC}}$ to
any of the model based belief revision operators. As a consequence we have that
none of such operators can be compactly represented by $\mathcal{CIRC}$ or by
$\circ_G$.

Theorem 2. $\overline{\mathcal{CIRC}} \mapsto \circ_{Win}$,
$\overline{\mathcal{CIRC}} \mapsto \circ_B$,
$\overline{\mathcal{CIRC}} \mapsto \circ_F$.

Proof. We will prove the theorem only for the $\circ_{Win}$ operator, since the
proof can be easily adapted to the other two operators. Let F be a
propositional formula over the variables $x_1, \ldots, x_n$. We show that two
formulas T and P of polynomial size exist such that
$\mathcal{M}(T \circ_{Win} P) = \overline{\mathcal{CIRC}}(F)$.

Let $y_1, \ldots, y_n$ be a set of new variables in one-to-one correspondence
with $x_1, \ldots, x_n$ and let $m^y$ be the set $\{y_i \mid x_i \in m\}$. We
construct two formulas T and P over the set of variables
$x_1, \ldots, x_n, y_1, \ldots, y_n$ such that

$$\mathcal{M}(T) = \{m_1 \cup m_2^y \mid m_1, m_2 \in \mathcal{M}(F),\ m_2 \subset m_1\}$$

and $\mathcal{M}(P) = \mathcal{M}(F)$. Let us observe that if no two models
$m_1, m_2 \in \mathcal{M}(F)$ exist such that $m_2 \subset m_1$, then T is not
satisfiable. We will see in the sequel how to deal with that case. Thus, let us
suppose $\mathcal{M}(T) \neq \emptyset$ and let

$$T = F \wedge F[x_i/y_i] \wedge \underbrace{\bigwedge_{i=1}^{n} (y_i \rightarrow x_i) \wedge \bigvee_{i=1}^{n} (x_i \wedge \neg y_i)}_{m_2 \subset m_1}$$

and $P = F \wedge \bigwedge_{i=1}^{n} \neg y_i$. We now prove that
$m \in \mathcal{M}(T \circ_{Win} P) \Leftrightarrow m \in \overline{\mathcal{CIRC}}(F)$.

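($\Rightarrow$) By the definition of $\circ_{Win}$ we have that a model
$m^T \in \mathcal{M}(T)$ exists such that $m \Delta m^T \in \mu(m^T, P)$. Let
$m^T = m_1 \cup m_2^y$ where $m_1, m_2 \in \mathcal{M}(F)$ and
$m_2 \subset m_1$. Let us first observe that, since m does not contain any
variable $y_i$,

$$m \Delta m^T = (m \Delta m_1) \cup m_2^y.$$

We now prove that $m = m_1$. Suppose, by contradiction, that
$m \Delta m_1 \neq \emptyset$. Then,
$m_1 \Delta m^T = m_2^y \subset m \Delta m^T$, which implies that
$m \Delta m^T \notin \mu(m^T, P)$, thus a contradiction. So,
$m_2 \subset m_1 = m$, that is $m \in \overline{\mathcal{CIRC}}(F)$.

($\Leftarrow$) There exists $m_2 \in \mathcal{M}(F)$ such that
$m_2 \subset m$. Let $m^T = m \cup m_2^y$. Clearly $m^T \in \mathcal{M}(T)$.
Suppose by contradiction that $m \Delta m^T \notin \mu(m^T, P)$. Thus, an
$m_1 \in \mathcal{M}(P)$ exists such that
$m_1 \Delta m^T = (m_1 \Delta m) \cup m_2^y \subset m \Delta m^T = m_2^y$, thus
a contradiction.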

We now consider the case in which no two models
$m_1, m_2 \in \mathcal{M}(F)$ exist such that $m_2 \subset m_1$. To this aim,
we have to slightly modify the above construction and consider the formula
$F' = (F \wedge \neg x_{n+1}) \vee \bigwedge_{i=1}^{n+1} x_i$, where $x_{n+1}$
is a new variable. Let T' and P' be the formulas obtained by replacing F with
F' in the definition of T and P, respectively. Let
$x = \{x_1, \ldots, x_{n+1}\}$. We then have that

$$\mathcal{M}(T' \circ_{Win} P') = \overline{\mathcal{CIRC}}(F') = \overline{\mathcal{CIRC}}(F) \cup \{x\}.$$

Finally, the above reduction also applies to $\circ_F$, while it can be easily
adapted for $\circ_B$ (it suffices to guarantee that $T' \wedge P'$ is not
consistent). Hence, the theorem follows.
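
The construction used in the proof can be checked mechanically on a small instance. The Python sketch below works on explicit model sets rather than on formulas: it builds the models of T and P as described in the proof and compares the revised models with the non-minimal models of F. The encoding of the copy variable $y_i$ as a pair `("y", x_i)`, the function names and the example model set are all assumptions made for this illustration.

```python
def min_subsets(sets):
    """Inclusion-minimal members of a collection of frozensets."""
    sets = set(sets)
    return {s for s in sets if not any(t < s for t in sets)}

def winslett(MT, MP):
    """Models n of P whose difference from some model m of T is inclusion-minimal."""
    return {n for n in MP
            if any((m ^ n) in min_subsets({m ^ k for k in MP}) for m in MT)}

def forbus(MT, MP):
    """Models n of P at minimum Hamming distance from some model m of T."""
    return {n for m in MT for n in MP
            if len(m ^ n) == min(len(m ^ k) for k in MP)}

def co_circ(MF):
    """Non-minimal models: those with a proper subset that is also a model."""
    return {m for m in MF if any(n < m for n in MF)}

def build_T_P(MF):
    """Model sets of T and P from the proof (x-part m1, y-copy of m2, m2 < m1)."""
    y_copy = lambda m: frozenset(("y", v) for v in m)
    MT = {m1 | y_copy(m2) for m1 in MF for m2 in MF if m2 < m1}
    MP = set(MF)  # models of P: models of F with every y_i set to 0
    return MT, MP

# Hypothetical model set of F over x1, x2, x3
MF = {frozenset(s) for s in (["x1"], ["x1", "x3"], ["x2", "x3"], ["x1", "x2", "x3"])}
MT, MP = build_T_P(MF)
revised_win = winslett(MT, MP)
revised_forbus = forbus(MT, MP)
print(revised_win == co_circ(MF), revised_forbus == co_circ(MF))
```

This only tests one instance in which T is satisfiable; it is a sanity check of the statement, not a substitute for the proof.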

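Theorem 3. $\overline{\mathcal{CIRC}} \mapsto \circ_D$,
$\overline{\mathcal{CIRC}} \mapsto \circ_{Web}$,
$\overline{\mathcal{CIRC}} \mapsto \circ_S$.

Proof (sketch). First of all we slightly modify the formulas T and P of
Theorem 2 as follows:

$$T = F' \wedge F'[x_i/y_i] \wedge \underbrace{\Big(\bigvee_{i=1}^{n+1} \neg(x_i \equiv y_i)\Big)}_{m_1 \neq m_2} \wedge \underbrace{\Big(\bigwedge_{i=1}^{n+1} (y_i \rightarrow x_i)\Big)}_{m_2 \subseteq m_1} \wedge \Big(\bigwedge_{i=1}^{n+1} \neg(y_i \equiv z_i)\Big)$$

and $P = F' \wedge \bigwedge_{i=1}^{n+1} \neg y_i \wedge \bigwedge_{i=1}^{n+1} \neg z_i$,
where F' is defined as in the proof of Theorem 2.

In the case of $\circ_D$, the proof is a consequence of the following claims:

Claim 1: $k_{T,P} = n + 1$.

Claim 2: For all $m \in \overline{\mathcal{CIRC}}(F)$, there exists
$m^T \in \mathcal{M}(T)$ such that $d(m, m^T) = n + 1$.

Claim 3: For any $m \notin \overline{\mathcal{CIRC}}(F)$ and for all
$m^T \in \mathcal{M}(T)$, $d(m, m^T) > n + 1$.

As far as $\circ_{Web}$ and $\circ_S$ are concerned, we first observe that
$\delta(T, P)$ does not contain any variable $x_i$. Moreover, it is easy to see
that, for any $n \notin \overline{\mathcal{CIRC}}(F)$ and for any
$m^T \in \mathcal{M}(T)$, $n \Delta m^T$ contains at least one variable $x_i$.
This proves the theorem.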




214 



Paolo Penna 



4 (Non) Compactability of Model Based Revision 

In this section we consider the problem of compiling the revised knowledge base
into a model equivalent one that has model checking either in NP or co-NP. To
this aim we will denote by $\mathcal{F}_{\text{NP}}$ and
$\mathcal{F}_{\text{co-NP}}$ formalisms¹ whose model checking is in NP and
co-NP, respectively.

Let us first observe that an immediate consequence of the reductions given
in Sect. 3 and of Corollary 1 is the following fact.

Corollary 2. For any
$\circ \in \{\circ_{Win}, \circ_B, \circ_F, \circ_S, \circ_D, \circ_{Web}\}$,
$\circ \not\mapsto \mathcal{F}_{\text{co-NP}}$.

The above result implies that such operators cannot be represented by means
of $\mathcal{CIRC}$. Motivated by this fact we ask whether it is possible to
obtain compact representations of $\mathcal{M}(T \circ P)$ by means of
$\mathcal{F}_{\text{NP}}$. In this case, we show that the situation is more
tangled.

We first consider Dalal’s and Weber’s revision and show that they admit a
compact representation by means of $\overline{\mathcal{CIRC}}$.

The main idea of the reductions is that both $k_{T,P}$ and $\Omega$ can be
represented in polynomial space ($k_{T,P}$ is an integer and
$|\Omega| \le n$). Moreover, once those two entities have been computed
(off-line), the problem of deciding $m \in \mathcal{M}(T \circ P)$ is in NP for
both operators.
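
The off-line/on-line split described above can be sketched for Dalal's operator as follows. This is not the reduction to $\overline{\mathcal{CIRC}}$ behind Theorem 4, only an illustration of why the precomputed integer $k_{T,P}$ makes membership checkable with an NP-style certificate. Formulas are represented as Python predicates, and the names `compile_dalal` and `verify` are hypothetical; the off-line step shown here is a brute-force enumeration and assumes that T and P are both satisfiable.

```python
from itertools import combinations

def assignments(variables):
    """Yield every truth assignment as the frozenset of variables mapped to 1."""
    ordered = sorted(variables)
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            yield frozenset(subset)

def compile_dalal(T, P, variables):
    """Off-line step (exponential here): compute the integer k_{T,P}."""
    MT = [m for m in assignments(variables) if T(m)]
    MP = [n for n in assignments(variables) if P(n)]
    return min(len(m ^ n) for m in MT for n in MP)

def verify(T, P, k, n, witness):
    """On-line step, polynomial time: check that `witness` certifies that n is a
    model of the revised base, i.e. n satisfies P and the witness is a model
    of T at distance exactly k from n."""
    return P(n) and T(witness) and len(witness ^ n) == k

# Hypothetical example: T = a and b, P = not (a and b)
T = lambda m: "a" in m and "b" in m
P = lambda m: not ("a" in m and "b" in m)
k = compile_dalal(T, P, {"a", "b"})
accepted = verify(T, P, k, frozenset({"a"}), frozenset({"a", "b"}))
rejected = verify(T, P, k, frozenset(), frozenset({"a", "b"}))
print(k, accepted, rejected)
```

A nondeterministic machine would guess the witness; only the value of k has to be stored as the compiled information, which takes a logarithmic number of bits.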

Theorem 4. $\circ_D \mapsto \overline{\mathcal{CIRC}}$,
$\circ_{Web} \mapsto \overline{\mathcal{CIRC}}$.

The above result can be easily extended to the case of a polynomial number
of revision steps. Notice that a different proof can be derived by making use of
the fact that these two operators are query compactable |S|.

We now consider the other revision operators. To this aim we combine the
results proved in Sect. 2 with the results given in [14]. In particular we
exploit the fact that such operators can be used to represent $\mathcal{CIRC}$.

Theorem 5 ([14]). For any
$\circ \in \{\circ_{Win}, \circ_B, \circ_F, \circ_S\}$,
$\mathcal{CIRC} \mapsto \circ$.

The above theorem combined with Corollary 1 yields the following result.

Corollary 3. For any $\circ \in \{\circ_{Win}, \circ_B, \circ_F, \circ_S\}$,
$\circ \not\mapsto \mathcal{F}_{\text{NP}}$.



5 Succinctness of Belief Revision 

We compare the space efficiency of the belief revision operators and consider the
problem of compiling one operator into another. By combining our results with
previously known results we will obtain the partial ordering shown in Fig. 1.
To this aim we first introduce the following notation.

¹ In this case the term formalism is quite general, since it refers to any representation
of a set of models. 
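Definition 3. For any two logical formalisms $\mathcal{F}_1$ and
$\mathcal{F}_2$: (i) $\mathcal{F}_1 \prec \mathcal{F}_2$ if both
$\mathcal{F}_1 \mapsto \mathcal{F}_2$ and $\mathcal{F}_2 \not\mapsto \mathcal{F}_1$;
(ii) $\mathcal{F}_1 \approx \mathcal{F}_2$ if
$\mathcal{F}_1 \mapsto \mathcal{F}_2$ and $\mathcal{F}_2 \mapsto \mathcal{F}_1$;
(iii) $\mathcal{F}_1 \not\approx \mathcal{F}_2$ if both
$\mathcal{F}_1 \not\mapsto \mathcal{F}_2$ and $\mathcal{F}_2 \not\mapsto \mathcal{F}_1$.

All of the following results are easy consequences of Theorem El Theorem El
and CorollaryEl We first compare the $\circ_D$ and $\circ_{Web}$ operators with
the other model based ones. All of the results also hold for a polynomial
number of revision steps of $\circ_D$ or $\circ_{Web}$.

Corollary 4. For any $\circ \in \{\circ_{Win}, \circ_B, \circ_F, \circ_S\}$,
$\circ_D \approx \circ_{Web} \approx \overline{\mathcal{CIRC}} \prec \circ$.

Corollary 5. For any $\circ \in \{\circ_D, \circ_{Web}\}$,
$\mathcal{CIRC} \not\approx \circ$ and $\circ_G \not\approx \circ$.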

6 Conclusions and Open Problems 

We have shown that belief revision operators with the same model checking
and inference complexity behave differently in terms of compilability and
space efficiency. We precisely characterized the space efficiency of $\circ_D$
and $\circ_{Web}$ which, following the definitions of [3], are
model-NP-complete. Moreover, our results combined with those in P! imply that
$\circ_{Win}$, $\circ_B$, $\circ_F$ and $\circ_S$ are both model-NP-hard and
model-co-NP-hard.

The first problem left open is that of finding similar characterizations for the
latter operators, as well as that of understanding their relative space efficiency.
More generally, it could be interesting to investigate relationships with other
formalisms considered in uni 0] such as default logic, model preference and
autoepistemic logic. Furthermore, compactability results for the case of iterated
revision are not known. Do these operators become even harder to compact
when more than one step of revision is considered?

It is interesting to observe that a different situation occurs when query equiv- 
alence is considered. Indeed, in m the authors proved that in this case oq and 
05 can be reduced one to the other and can be reduced to both. 

Acknowledgments. I am grateful to Marco Cadoli for introducing me to the 
area of knowledge compilation and for several useful discussions. I am also grate- 
ful to Riccardo Silvestri for his valuable comments on a preliminary version of 
this work. 
Abstract. Well-known theorems of Hanf’s and Gaifman’s establishing
locality of first-order definable properties have been used in many appli- 
cations. These theorems were recently generalized to other logics, which 
led to new applications in descriptive complexity and database theory. 
However, a logical characterization of local properties that correspond to 
Hanf’s and Gaifman’s theorems, is still lacking. Such a characterization 
only exists for structures of bounded valence. 

In this paper, we give logical characterizations of local properties behind 
Hanf’s and Gaifman’s theorems. We first deal with an infinitary logic 
with counting terms and quantifiers, that is known to capture Hanf- 
locality on structures of bounded valence. We show that testing iso- 
morphism of neighborhoods can be added to it without violating Hanf- 
locality, while increasing its expressive power. We then show that adding 
local second-order quantification to it captures precisely all Hanf-local 
properties. To capture Gaifman-locality, one must also add a (potentially 
infinite) case statement. We further show that the hierarchy based on 
the number of variants in the case statement is strict. 



1 Introduction 

It is well known that first-order logic (FO) only expresses local properties. Two 
best known formal results stating locality of FO are Hanf’s and Gaifman’s the- 
orems nag. They both found numerous applications in computer science, due 
to the fact that they are among relatively few results in first-order model theory 
that extend to finite structures. Gaifman’s theorem itself works for both finite 
and infinite structures, while for Hanf’s theorem an extension to finite structures 
was formulated by Fagin, Stockmeyer, and Vardi 0. 

More recently, the statements underlying Hanf’s and Gaifman’s theorems 
have been abstracted from the statements of the theorems, and used in their 
own right. In essence, Hanf’s theorem states that two structures cannot be dis- 
tinguished by sentences of quantifier rank k whenever they realize the same mul- 
tiset of d-neighborhoods of points; here d depends only on k. Gaifman’s theorem 
states that in a given structure, two tuples cannot be distinguished by formulae 
of quantifier rank k whenever d-neighborhoods of these tuples are isomorphic; 
again d is determined by k. 

* Part of this work was done while visiting INRIA. 



H. Reichel and S. Tison (Eds.): STAGS 2000, LNCS 1770, pp. 217 - 172^1 2000. 
(c) Springer- Verlag Berlin Heidelberg 2000 
It was shown that Hanf’s theorem is strictly stronger than Gaifman’s, and
that both apply to a variety of logics that extend FO with counting mechanisms
and limited infinitary connectives |1 1 II 411 511 hl22| . Since the complexity
class $TC^0$ (with the appropriate notion of uniformity) can be captured by FO
with counting quantifiers, these results found applications in descriptive
complexity, where they were used to prove lower bounds for logics coming very
close to capturing $TC^0$ EU. They were also applied in database theory, where
they were used to prove expressivity bounds for relational query languages with
aggregation that correspond to practical query languages such as SQL. For
applications to automata, see [24].



The abstract notions of locality were themselves characterized only on finite
structures of bounded valence (e.g., for graphs of fixed maximum degree). The
characterization for Hanf-locality uses a logic
$\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ introduced in [IS] as a counterpart
of a finite variable logic $\mathcal{L}^\omega_{\infty\omega}$. While
$\mathcal{L}^\omega_{\infty\omega}$ subsumes a number of fixpoint logics and is
easier to study, $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ subsumes a number
of counting extensions of FO (such as FO with counting quantifiers ^2j, FO with
unary generalized quantifiers HSCBl, FO with unary counters [2]) and is quite
easy to deal with. A result in HH states that Hanf-local properties on
structures of bounded valence are precisely those definable in
$\mathcal{L}^*_{\infty\omega}(\mathbf{C})$.



The question naturally arises whether this continues to hold for arbitrary
finite structures. We show in this paper that this is not the case. We do so by
first finding a simple direct proof of Hanf-locality of
$\mathcal{L}^*_{\infty\omega}(\mathbf{C})$, and then using it to show that
adding new atomic formulae testing isomorphism of neighborhoods of a fixed
radius does not violate Hanf-locality, while strictly increasing the expressive
power. We next define a logic that captures precisely the Hanf-local
properties. It is obtained by adding local second-order quantification to
$\mathcal{L}^*_{\infty\omega}(\mathbf{C})$. That is, second-order quantifiers
bind predicates that are only allowed to range over fixed radius neighborhoods
of free first-order variables. We will also show that this amounts to adding
arbitrarily powerful computations to
$\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ as long as they are bound to some
neighborhoods.

For Gaifman-locality, a characterization theorem in HH stated that it is
equivalent, over structures of bounded valence, to first-order definition by cases.
That is, there are m > 0 classes of structures and m FO formulae $\varphi_i$
such that over the $i$th class, the given property is described by
$\varphi_i$. Again, this falls short of a general characterization. We show
that over the class of all finite structures (no restriction on valence),
Gaifman-locality is equivalent to definition by cases, where the number of
classes can be infinite. Furthermore, the hierarchy given by the number of
those classes (that is, the number of cases) is strict.

Organization. Section 2 introduces notations and notions of locality. Section
3 gives a new simple proof of Hanf-locality of
$\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ which is then used to show that
adding tests for neighborhood isomorphism preserves locality. Section 4
characterizes Hanf-local properties as those definable in
$\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ with local second-order
quantification. Section 5 characterizes Gaifman-local properties as those
definable by (finite or infinite) case statements, and shows the strictness of
the hierarchy. All proofs can be found in the full version ED).

2 Notation

Finite structures and neighborhoods. All structures are assumed to be finite. A
relational signature $\sigma$ is a set of relation symbols $\{R_1, \ldots, R_l\}$, with
associated arities $p_i > 0$. A $\sigma$-structure is $\mathcal{A} = \langle A, R_1^{\mathcal{A}}, \ldots, R_l^{\mathcal{A}} \rangle$, where $A$ is a
finite set, and $R_i^{\mathcal{A}} \subseteq A^{p_i}$ interprets $R_i$. The class of finite $\sigma$-structures is
denoted by STRUCT[$\sigma$]. When there is no confusion, we write $R_i$ in place of
$R_i^{\mathcal{A}}$. Isomorphism is denoted by $\cong$. The carrier of a structure $\mathcal{A}$ is always
denoted by $A$ and the carrier of $\mathcal{B}$ is denoted by $B$.

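Given a structure $\mathcal{A}$, its Gaifman graph $G(\mathcal{A})$ is defined as $\langle A, E \rangle$,
where $(a, b)$ is in $E$ iff there is a tuple $\vec{c} \in R_i^{\mathcal{A}}$ for some $i$ such that both $a$ and $b$
are in $\vec{c}$. The distance $d(a, b)$ is defined as the length of the shortest path from
$a$ to $b$ in $G(\mathcal{A})$; we assume $d(a, a) = 0$. If $\vec{a} = (a_1, \ldots, a_n)$ and $\vec{b} = (b_1, \ldots, b_m)$,
then $d(\vec{a}, \vec{b}) = \min_{i,j} d(a_i, b_j)$. Given $\vec{a}$ over $A$, its $r$-sphere $S_r^{\mathcal{A}}(\vec{a})$ is
$\{b \in A \mid d(\vec{a}, b) \le r\}$. Its $r$-neighborhood $N_r^{\mathcal{A}}(\vec{a})$ is defined as a structure in the
signature that extends $\sigma$ with $n$ new constant symbols:

$$\langle S_r^{\mathcal{A}}(\vec{a}),\ R_1^{\mathcal{A}} \cap S_r^{\mathcal{A}}(\vec{a})^{p_1},\ \ldots,\ R_l^{\mathcal{A}} \cap S_r^{\mathcal{A}}(\vec{a})^{p_l},\ a_1, \ldots, a_n \rangle$$

That is, the carrier of $N_r^{\mathcal{A}}(\vec{a})$ is $S_r^{\mathcal{A}}(\vec{a})$, the interpretation of the $\sigma$-relations
is inherited from $\mathcal{A}$, and the $n$ extra constants are the elements of $\vec{a}$. If $\mathcal{A}$ is
understood, we write $S_r(\vec{a})$ and $N_r(\vec{a})$.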
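The definitions of the Gaifman graph, the $r$-sphere and the $r$-neighborhood translate directly into a small program. The following is a minimal sketch, assuming a structure is given as a Python collection `universe` together with a dictionary `relations` mapping each relation name to a set of tuples; the function names are illustrative and not part of the paper.

```python
from collections import deque
from itertools import combinations


def gaifman_graph(universe, relations):
    """Adjacency sets: a and b are adjacent iff they occur together in some tuple."""
    adj = {a: set() for a in universe}
    for tuples in relations.values():
        for t in tuples:
            for a, b in combinations(set(t), 2):
                adj[a].add(b)
                adj[b].add(a)
    return adj


def sphere(adj, centers, r):
    """Elements at Gaifman distance at most r from some element of `centers`."""
    dist = {c: 0 for c in centers}
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        if dist[u] < r:
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
    return set(dist)


def neighborhood(universe, relations, centers, r):
    """r-neighborhood: relations restricted to the r-sphere, centers kept as constants."""
    s = sphere(gaifman_graph(universe, relations), centers, r)
    restricted = {name: {t for t in tuples if set(t) <= s}
                  for name, tuples in relations.items()}
    return s, restricted, tuple(centers)
```

Starting the breadth-first search from all components of the tuple at once matches the convention that the distance from a tuple is the minimum over its components.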

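If $\mathcal{A}, \mathcal{B} \in$ STRUCT[$\sigma$], and there is an isomorphism $N_r^{\mathcal{A}}(\vec{a}) \to N_r^{\mathcal{B}}(\vec{b})$ (that
sends $\vec{a}$ to $\vec{b}$), we write $\vec{a} \approx_r^{\mathcal{A},\mathcal{B}} \vec{b}$. If $\mathcal{A} = \mathcal{B}$, we write $\vec{a} \approx_r^{\mathcal{A}} \vec{b}$.

Given a tuple $\vec{a} = (a_1, \ldots, a_n)$, we write $\vec{a}c$ for the tuple $(a_1, \ldots, a_n, c)$.

The quantifier rank of a formula is denoted by qr(·).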

Hanf’s and Gaifman’s theorems. An $m$-ary query on $\sigma$-structures, $Q$, is a mapping
that associates to each $\mathcal{A} \in$ STRUCT[$\sigma$] a structure $\langle A, S \rangle$, where $S \subseteq A^m$. We
always assume that queries are invariant under isomorphisms. We write $\vec{a} \in Q(\mathcal{A})$
if $\vec{a} \in S$, where $\langle A, S \rangle = Q(\mathcal{A})$. A query $Q$ is definable in a logic $\mathcal{L}$ if there exists
an $\mathcal{L}$ formula $\varphi(x_1, \ldots, x_m)$ such that $Q(\mathcal{A}) = \langle A, \{\vec{a} \mid \mathcal{A} \models \varphi(\vec{a})\} \rangle$. If $m = 0$,
then $Q$ is naturally associated with a subclass of STRUCT[$\sigma$] and definability
means definability by a sentence of $\mathcal{L}$.

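Definition 1. (cf. |4ll4|) An $m$-ary query $Q$, $m \ge 1$, is called Gaifman-local if
there exists a number $r > 0$ such that, for any structure $\mathcal{A}$ and any $\vec{a}, \vec{b} \in A^m$,

$$\vec{a} \approx_r^{\mathcal{A}} \vec{b} \quad\text{implies}\quad \vec{a} \in Q(\mathcal{A}) \text{ iff } \vec{b} \in Q(\mathcal{A}).$$

The minimum such $r$ is called the locality rank of $Q$, and is denoted by lr($Q$). □

Theorem 1 (Gaifman). Every FO formula $\varphi(x_1, \ldots, x_m)$ defines a Gaifman-local
query $Q$ with lr($Q$) $\le (7^{\mathrm{qr}(\varphi)} - 1)/2$.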

The statement of Gaifman’s theorem actually provides more information
about FO definable properties; it states that every formula is a Boolean
combination of sentences of a special form and open formulae in which quantifiers
are restricted to certain neighborhoods. However, it is the above statement that is
used in most applications for proving expressivity bounds, and it also extends
beyond FO. Note also that better bounds of the order $O(2^{\mathrm{qr}(\varphi)})$ are known for
lr($Q$), see ITTl.

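For $\mathcal{A}, \mathcal{B} \in$ STRUCT[$\sigma$], we write $\mathcal{A} \leftrightarrows_d \mathcal{B}$ if the multisets of isomorphism
types of $d$-neighborhoods of points are the same in $\mathcal{A}$ and $\mathcal{B}$. That is, $\mathcal{A} \leftrightarrows_d \mathcal{B}$
if there exists a bijection $f : A \to B$ such that $N_d^{\mathcal{A}}(a) \cong N_d^{\mathcal{B}}(f(a))$ for every
$a \in A$. We also write $(\mathcal{A}, \vec{a}) \leftrightarrows_d (\mathcal{B}, \vec{b})$ if there is a bijection $f : A \to B$ such that
$N_d^{\mathcal{A}}(\vec{a}c) \cong N_d^{\mathcal{B}}(\vec{b}f(c))$ for every $c \in A$.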
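For very small directed graphs, the condition that two structures realize the same multiset of $d$-neighborhood types can be checked by brute force. The sketch below is only an illustration under stated assumptions: graphs are pairs `(nodes, edges)`, isomorphism types are computed by trying every relabeling of the sphere (exponential, so suitable only for tiny neighborhoods), and all function names are hypothetical.

```python
from collections import Counter, deque
from itertools import permutations


def sphere(nodes, edges, center, d):
    """Nodes within distance d of center in the underlying undirected graph."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] < d:
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
    return set(dist)


def neighborhood_type(nodes, edges, center, d):
    """Canonical code of the d-neighborhood of center (center is always label 0)."""
    ball = sphere(nodes, edges, center, d)
    inner = [(u, v) for (u, v) in edges if u in ball and v in ball]
    others = [v for v in ball if v != center]
    best = None
    for perm in permutations(others):
        label = {center: 0}
        for i, v in enumerate(perm):
            label[v] = i + 1
        code = tuple(sorted((label[u], label[v]) for (u, v) in inner))
        if best is None or code < best:
            best = code
    return (len(ball), best)


def hanf_equivalent(graph_a, graph_b, d):
    """True iff both graphs have the same multiset of d-neighborhood types."""
    types_a = Counter(neighborhood_type(*graph_a, a, d) for a in graph_a[0])
    types_b = Counter(neighborhood_type(*graph_b, b, d) for b in graph_b[0])
    return types_a == types_b


def cycles(count, length):
    """Disjoint union of `count` directed cycles, each with `length` nodes."""
    nodes, edges = [], []
    for c in range(count):
        for i in range(length):
            nodes.append((c, i))
            edges.append(((c, i), (c, (i + 1) % length)))
    return nodes, edges
```

One directed 8-cycle and two directed 4-cycles agree on all 1-neighborhoods but not on 2-neighborhoods, so `hanf_equivalent(cycles(1, 8), cycles(2, 4), 1)` holds while the same call with `d = 2` does not. Equality of the two multisets is equivalent to the existence of the bijection $f$ in the definition.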

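Definition 2 (Hanf-locality). (see [1 2f7| 1 4^ ) An $m$-ary query $Q$, $m \ge 0$, is
called Hanf-local if there exists a number $d > 0$ such that for any two structures
$\mathcal{A}, \mathcal{B}$ and any $\vec{a} \in A^m$, $\vec{b} \in B^m$,

$$(\mathcal{A}, \vec{a}) \leftrightarrows_d (\mathcal{B}, \vec{b}) \quad\text{implies}\quad \vec{a} \in Q(\mathcal{A}) \text{ iff } \vec{b} \in Q(\mathcal{B}).$$

The minimum $d$ for which this holds is called the Hanf locality rank of $Q$, and is
denoted by hlr($Q$).

For a Boolean query $Q$ ($m = 0$) this means that $Q$ cannot distinguish two
structures $\mathcal{A}$ and $\mathcal{B}$ whenever $\mathcal{A} \leftrightarrows_d \mathcal{B}$.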

Theorem 2 (Hanf, Fagin-Stockmeyer-Vardi). Every FO sentence $\Phi$ defines
a Hanf-local Boolean query $Q$ with hlr($Q$) $\le 3^{\mathrm{qr}(\Phi)}$. □

An extension to open formulae, although easily derivable from the proof of
□, was probably first explicitly stated in [H: every FO formula $\varphi(\vec{x})$ defines
a Hanf-local query. Better bounds of the order $O(2^{\mathrm{qr}(\varphi)})$ are also known for
Hanf-locality irnrm.

It was shown in HD that every Hanf-local $m$-ary query, $m \ge 1$, is
Gaifman-local.

Logic $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$. The logic $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ subsumes a number of counting extensions of
FO, such as FO with counting quantifiers [tif 1 7], unary quantifiers PI, and unary
counters [2]. (When we speak of counting extensions of FO, we mean extensions
that only add a counting mechanism, as opposed to those, extensively studied
in the literature, that add both counting and fixpoint.) It is a two-sorted
logic, with one sort being the universe of a finite structure, and the other
sort being $\mathbb{N}$, and it uses counting terms that produce constants of the second
sort, similarly to the logics studied in P33-. The formal definition is as follows.

We denote the infinitary logic by $\mathcal{L}_{\infty\omega}$; it extends FO by allowing infinite
conjunctions $\bigwedge$ and disjunctions $\bigvee$. Then $\mathcal{L}_{\infty\omega}(\mathbf{C})$ is a two-sorted logic that
extends $\mathcal{L}_{\infty\omega}$. Its structures are of the form $(\mathcal{A}, \mathbb{N})$, where $\mathcal{A}$ is a finite relational
structure, and $\mathbb{N}$ is a copy of natural numbers. We shall use $x$, $y$, etc., for variables
ranging over the first (non-numerical) sort, and $i$, $j$, etc., for variables ranging
over the second (numerical) sort. Assume that every constant $n \in \mathbb{N}$ is a
second-sort term. To $\mathcal{L}_{\infty\omega}$, add counting quantifiers $\exists i x$ for every $i \in \mathbb{N}$, and counting
terms: If $\varphi$ is a formula and $\vec{x}$ is a tuple of free first-sort variables in $\varphi$, then
$\#\vec{x}.\varphi$ is a term of the second sort, and its free variables are those in $\varphi$ except
$\vec{x}$. Its interpretation is the number of $\vec{a}$ over the finite first-sort universe that
satisfy $\varphi$. That is, given a structure $\mathcal{A}$, a formula $\varphi(\vec{x}, \vec{y}; \vec{\jmath})$, $\vec{b} \subseteq A$, and $\vec{\jmath}_0 \subseteq \mathbb{N}$,
the value of the term $\#\vec{x}.\varphi(\vec{x}, \vec{b}; \vec{\jmath}_0)$ is the cardinality of the (finite) set $\{\vec{a} \subseteq A \mid
\mathcal{A} \models \varphi(\vec{a}, \vec{b}; \vec{\jmath}_0)\}$. For example, the interpretation of $\#x.E(x, y)$ is the in-degree
of node $y$ in a graph with the edge-relation $E$. The interpretation of $\exists i x\, \varphi$ is
$\#x.\varphi \ge i$.
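The in-degree example can be made concrete. The following minimal sketch assumes the first-sort universe is a finite Python collection and a formula with one free first-sort variable is modelled as a Boolean function; `count_term` and `exists_at_least` are illustrative names, not notation from the paper.

```python
def count_term(universe, phi):
    """Value of the counting term #x.phi(x): number of elements satisfying phi."""
    return sum(1 for a in universe if phi(a))


def exists_at_least(universe, i, phi):
    """Counting quantifier: at least i elements of the universe satisfy phi."""
    return count_term(universe, phi) >= i


# A directed graph with edge relation E; the in-degree of y is #x.E(x, y).
universe = [1, 2, 3]
E = {(1, 3), (2, 3), (3, 1)}


def in_degree(y):
    return count_term(universe, lambda x: (x, y) in E)
```

Here `in_degree(3)` counts the two edges entering node 3, and `exists_at_least(universe, 2, lambda x: (x, 3) in E)` states the same fact with a counting quantifier.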

As this logic is too powerful (it expresses every property of finite structures),
we restrict it by means of the rank of formulae and terms, denoted by rk. It is
defined as quantifier rank (that is, it is 0 for variables and constants $n \in \mathbb{N}$,
$\mathrm{rk}(\bigvee_i \varphi_i) = \max_i \mathrm{rk}(\varphi_i)$, $\mathrm{rk}(\neg\varphi) = \mathrm{rk}(\varphi)$, $\mathrm{rk}(\exists x \varphi) = \mathrm{rk}(\exists i x\, \varphi) = \mathrm{rk}(\varphi) + 1$) but
does not take into account quantification over $\mathbb{N}$: $\mathrm{rk}(\exists i \varphi) = \mathrm{rk}(\varphi)$. Furthermore,
$\mathrm{rk}(\#\vec{x}.\psi) = \mathrm{rk}(\psi) + |\vec{x}|$, and the rank of an atomic formula is the maximum
rank of a term in it.

Definition 3. (see m ) The logic $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ is defined to be the restriction of
$\mathcal{L}_{\infty\omega}(\mathbf{C})$ to terms and formulae of finite rank.

It is known m that $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ is closed under finitary Boolean connectives
and all quantification, and that every predicate on $\mathbb{N} \times \ldots \times \mathbb{N}$ is definable by
a $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ formula of rank 0. Thus, we assume that $+$, $*$, $-$, $\le$, and in fact
every predicate on $\mathbb{N}$ is available. Furthermore, counting terms can be eliminated
in $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ without increasing the rank (that is, counting quantifiers suffice,
although expressing properties with just counting quantifiers is often quite
awkward).

Fact 3 (see mm) Queries expressed by $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ formulae without free
variables of the second sort are Hanf-local and Gaifman-local. □

Gaifman-locality of $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ was proved by a simple direct argument in ^01;
Hanf-locality was shown using bijective Ehrenfeucht-Fraïssé games of HS|.

Structures of bounded valence (degree). If $\mathcal{A} \in$ STRUCT[$\sigma$], and $R_i$ is of arity $p_i$,
then $\mathrm{degree}_j(R_i^{\mathcal{A}}, a)$ for $1 \le j \le p_i$ is the number of tuples $\vec{a}$ in $R_i^{\mathcal{A}}$ having $a$ in
the $j$th position. In the case of directed graphs, this gives us the usual notions
of in- and out-degree. By $\mathrm{degset}(\mathcal{A})$ we mean the set of all degrees realized
in $\mathcal{A}$. We use the notation STRUCT$_k$[$\sigma$] for $\{\mathcal{A} \in$ STRUCT[$\sigma$] $\mid \mathrm{degset}(\mathcal{A}) \subseteq
\{0, 1, \ldots, k\}\}$.
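As a small illustration of the degree set, the sketch below computes it for a structure whose relations are stored as a dictionary from names to `(arity, set of tuples)` pairs. This representation and the function names are assumptions made for the example; positions are 0-based here, whereas the text counts them from 1.

```python
def degree(tuples, j, a):
    """Number of tuples having element a in position j (0-based)."""
    return sum(1 for t in tuples if t[j] == a)


def degset(universe, relations):
    """Set of all degrees realized by some element in some position of some relation."""
    return {degree(tuples, j, a)
            for arity, tuples in relations.values()
            for j in range(arity)
            for a in universe}


def has_valence_at_most(universe, relations, k):
    """Membership test for the class of structures whose degrees are all at most k."""
    return max(degset(universe, relations), default=0) <= k
```

For a directed graph, position 0 gives out-degrees and position 1 gives in-degrees, matching the remark in the text.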

Fact 4. For any fixed $k$, a query $Q$ on STRUCT$_k$[$\sigma$] is Hanf-local iff
it is expressed by a formula of $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ (without free second-sort variables). □






Leonid Libkin 



An $m$-ary query $Q$ on a class $\mathcal{C} \subseteq$ STRUCT[$\sigma$] is given by a first-order
definition by cases if there exists a number $p$, a partition $\mathcal{C} = \mathcal{C}_1 \cup \mathcal{C}_2 \cup \ldots \cup \mathcal{C}_p$
and first-order formulae $\alpha_1(x_1, \ldots, x_m), \ldots, \alpha_p(x_1, \ldots, x_m)$ in the language $\sigma$
such that on all structures $\mathcal{A} \in \mathcal{C}_i$, $Q$ is definable by $\alpha_i$. That is, for all $1 \le i \le p$
and $\mathcal{A} \in \mathcal{C}_i$, $\vec{a} \in Q(\mathcal{A})$ iff $\mathcal{A} \models \alpha_i(\vec{a})$.

Fact 5 (see m) For any fixed $k$, a query $Q$ on STRUCT$_k$[$\sigma$] is Gaifman-local
iff it is given by a first-order definition by cases. □

3 Isomorphism of Neighborhoods and $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$

We start with a slightly modified definition of locality that makes it convenient
to work with two-sorted logics, like $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$. We say that such a logic expresses
Hanf-local (or Gaifman-local) queries if for every formula $\varphi(\vec{x}, \vec{\imath})$ there exists a
number $d$ such that for every $\vec{\imath}_0 \subseteq \mathbb{N}$, the formula $\varphi_{\vec{\imath}_0}(\vec{x}) = \varphi(\vec{x}, \vec{\imath}_0)$ (without
free second-sort variables) expresses a query $Q$ with hlr($Q$) $\le d$ (lr($Q$) $\le d$,
respectively).

Consider a set $\Theta$ of relation symbols, disjoint from $\sigma$, and define $\mathcal{L}^*_{\infty\omega}(\mathbf{C}) + \Theta$
by allowing for each $k$-ary $U \in \Theta$ and a $k$-tuple $\vec{x}$ of variables of the first sort,
$U(\vec{x})$ to be a new atomic formula. The rank of this formula is 0. Assume that
we fix a semantics of predicates from $\Theta$. We then say that $\Theta$ is Hanf-local if there
exists a number $d$ such that each predicate in $\Theta$ defines a Hanf-local query $Q$
with hlr($Q$) $\le d$.

Theorem 6. Let $\Theta$ be Hanf-local. Then $\mathcal{L}^*_{\infty\omega}(\mathbf{C}) + \Theta$ expresses only Hanf-local
queries.

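Proof sketch. Let $d$ witness Hanf-locality of $\Theta$. We show that every $\mathcal{L}^*_{\infty\omega}(\mathbf{C}) + \Theta$
formula of rank $m$ defines a Hanf-local query $Q$ with hlr($Q$) $\le 3^m \cdot d + (3^m - 1)/2$
(for all instantiations of free variables of the second sort).

The proof is by induction on a formula. The atomic case follows from the
assumption that $\Theta$ is Hanf-local. The cases of Boolean and infinitary connectives,
as well as negation and quantification over the numerical sort are simple. It
remains to consider the case of $\psi(\vec{x}, \vec{\imath}) \equiv \exists i y\, \varphi(y, \vec{x}, \vec{\imath})$ (as counting terms can be
eliminated without increasing the rank $m$) and to show that if $\varphi$ defines a query
of Hanf locality rank $r$ for every $\vec{\imath}_0$, then $\psi$ defines a query $Q$ with hlr($Q$) $\le 3r + 1$.
For this, we need the following known result: if $(\mathcal{A}, \vec{a}) \leftrightarrows_{3r+1} (\mathcal{B}, \vec{b})$, then there
exists a bijection $f : A \to B$ such that $(\mathcal{A}, \vec{a}c) \leftrightarrows_r (\mathcal{B}, \vec{b}f(c))$ for all $c \in A$. We
then fix $\vec{\imath}_0$ and assume $(\mathcal{A}, \vec{a}) \leftrightarrows_{3r+1} (\mathcal{B}, \vec{b})$. Then, for $f$ as above, it is the case
that $\mathcal{A} \models \varphi(c, \vec{a}, \vec{\imath}_0)$ iff $\mathcal{B} \models \varphi(f(c), \vec{b}, \vec{\imath}_0)$, due to Hanf-locality of $\varphi$, and thus
$\mathcal{A} \models \psi(\vec{a}, \vec{\imath}_0)$ iff $\mathcal{B} \models \psi(\vec{b}, \vec{\imath}_0)$, as the number of elements satisfying $\varphi(\cdot, \vec{a}, \vec{\imath}_0)$ and
$\varphi(\cdot, \vec{b}, \vec{\imath}_0)$ is the same. This completes the proof. □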
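The bound $3^m \cdot d + (3^m - 1)/2$ in the proof sketch is exactly what the step from rank $r$ to rank $3r + 1$ produces when iterated. As a worked check (our own arithmetic, not taken from the paper): a rank-0 formula has Hanf locality rank at most $d = 3^0 \cdot d + (3^0 - 1)/2$, and one counting quantifier applied to a formula meeting the bound for $m$ gives the bound for $m + 1$:

```latex
\begin{align*}
3\Bigl(3^{m} d + \tfrac{3^{m}-1}{2}\Bigr) + 1
  &= 3^{m+1} d + \tfrac{3^{m+1}-3}{2} + 1 \\
  &= 3^{m+1} d + \tfrac{3^{m+1}-1}{2}.
\end{align*}
```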

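We now consider the following example. For each $d$, $k$, define a $2k$-ary
predicate $I_d^k(x_1, \ldots, x_k, y_1, \ldots, y_k)$ to be interpreted as follows: $\mathcal{A} \models I_d^k(\vec{a}, \vec{b})$ iff
$N_d^{\mathcal{A}}(\vec{a}) \cong N_d^{\mathcal{A}}(\vec{b})$. Clearly, $(\mathcal{A}, \vec{a}_1\vec{a}_2) \leftrightarrows_d (\mathcal{B}, \vec{b}_1\vec{b}_2)$ implies $N_d^{\mathcal{A}}(\vec{a}_1\vec{a}_2) \cong N_d^{\mathcal{B}}(\vec{b}_1\vec{b}_2)$,
and thus $\vec{a}_1 \approx_d^{\mathcal{A}} \vec{a}_2$ iff $\vec{b}_1 \approx_d^{\mathcal{B}} \vec{b}_2$. This shows Hanf-locality of $I_d^k$ and gives us

Corollary 1. For any fixed $d$, $\mathcal{L}^*_{\infty\omega}(\mathbf{C}) + \{I_d^k \mid k > 0\}$ only expresses Hanf-local
properties. □

We next show that this gives us an increase in expressive power. The result
below is proved using bijective games.

Proposition 1. For any $d, k > 0$, $\mathcal{L}^*_{\infty\omega}(\mathbf{C}) + I_d^k$ is strictly more expressive than
$\mathcal{L}^*_{\infty\omega}(\mathbf{C})$.

Corollary 2. The logic $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ fails to capture Hanf-local properties over
arbitrary finite structures. □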

Note that we only used the $I_d^k$s as atomic formulae. A natural extension
would be to use them as generalized quantifiers. In this case we extend
the definition of the logic by a rule that if $\varphi_1(\vec{v}_1, \vec{z}), \ldots, \varphi_l(\vec{v}_l, \vec{z})$ are
formulae with $\vec{v}_i$ being an $m_i$-tuple of first-sort variables, then
$\psi(\vec{x}, \vec{y}, \vec{z}) \equiv I_d^k[m_1, \ldots, m_l]\, \vec{v}_1, \ldots, \vec{v}_l\, (\varphi_1(\vec{v}_1, \vec{z}), \ldots, \varphi_l(\vec{v}_l, \vec{z}))$
is a formula with $\vec{x}$ and $\vec{y}$ being
$k$-tuples of fresh free variables of the first sort. The semantics is that for each $\mathcal{A}$
and $\vec{c}$, one defines a new structure on $A$ in which the $i$th predicate of arity $m_i$ is
interpreted as $\{\vec{u} \in A^{m_i} \mid \mathcal{A} \models \varphi_i(\vec{u}, \vec{c})\}$. Then $\mathcal{A} \models \psi(\vec{a}, \vec{b}, \vec{c})$ if in this structure
the $d$-neighborhoods of $\vec{a}$ and $\vec{b}$ are isomorphic. However, this generalization does
not preserve locality.

Proposition 2. Adding $I_d^k[m_1, \ldots, m_l]$ to $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ violates Hanf-locality. In
fact, with addition of $I_1^1[2]$ to FO one can define properties that are neither
Hanf-local nor Gaifman-local. □



4 Characterizing Hanf-Local Properties 

We have seen that the logic $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ fails to capture Hanf-local properties over
arbitrary finite structures. To fill the gap between $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ and Hanf-locality,
we introduce the notion of local second-order quantification. The idea is similar
to local first-order quantification, which restricts quantified variables to fixed
radius neighborhoods of free variables. This kind of quantification was used in
Gaifman’s locality theorem jSj as well as in translations of various modal logics
into fragments of FO |32S|.

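Definition 4. Fix $r > 0$ and a relational signature $\sigma$. Suppose that we have,
for every arity $k > 0$, a countably infinite set of $k$-ary relational symbols $T_k^i$,
$i \in \mathbb{N}$, disjoint from $\sigma$. Define a set of formulae $\mathcal{F}$ by starting with $\mathcal{L}_{\infty\omega}(\mathbf{C})$
atomic formulae involving symbols from $\sigma$ as well as the $T_k^i$s, and closing under the
formation rules of $\mathcal{L}_{\infty\omega}(\mathbf{C})$ and the following rule: If $\varphi(\vec{x}, \vec{\imath})$ is a formula, $\vec{y}$ is a
subtuple of $\vec{x}$ and $d \le r$, then

$$\psi_1(\vec{x}, \vec{\imath}) \equiv \exists T_k^i \sqsubseteq S_d(\vec{y})\ \varphi(\vec{x}, \vec{\imath}) \quad\text{and}\quad \psi_2(\vec{x}, \vec{\imath}) \equiv \forall T_k^i \sqsubseteq S_d(\vec{y})\ \varphi(\vec{x}, \vec{\imath})$$

are formulae of rank $\mathrm{rk}(\varphi) + 1$. We say that the symbol $T_k^i$ is bound in these
formulae.

We then define $\mathcal{LSO}^{*,r}_{\infty\omega}(\mathbf{C})$ over STRUCT[$\sigma$] as the set of all formulae in $\mathcal{F}$
of finite rank in which all occurrences of the symbols $T_k^i$ are bound. The logic
$\mathcal{LSO}^{*}_{\infty\omega}(\mathbf{C})$ (local second-order with counting) is defined as $\bigcup_{r>0} \mathcal{LSO}^{*,r}_{\infty\omega}(\mathbf{C})$.

The semantics of the new construct is as follows. Given a $\sigma$-structure $\mathcal{A}$
and an interpretation $\mathcal{T}$ for all the symbols $T_k^i$ occurring freely in $\psi_1$, we have
$(\mathcal{A}, \mathcal{T}) \models \psi_1(\vec{a}, \vec{\imath}_0)$ iff there exists a set $T \subseteq S_d(\vec{b})^k$, where $\vec{b}$ is the subtuple of $\vec{a}$
corresponding to $\vec{y}$, such that $(\mathcal{A}, \mathcal{T}, T) \models \varphi(\vec{a}, \vec{\imath}_0)$. For $\psi_2$, one replaces ‘exists’
by ‘for all’. □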

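For example, the formula

$$\exists x\, \exists T \sqsubseteq S_r(x)\, \exists T' \sqsubseteq S_r(x) \left( \begin{array}{l} \forall y \in S_r(x)\ \bigl((T(y) \wedge \neg T'(y)) \vee (\neg T(y) \wedge T'(y))\bigr) \\ \wedge\ \forall z, v\ \bigl((T(z) \wedge E(z, v) \to T'(v)) \wedge (T'(z) \wedge E(z, v) \to T(v))\bigr) \end{array} \right)$$

tests if there is a 2-colorable $r$-neighborhood of a node in a graph. Note that
local first-order quantification $\forall y \in S_r(x)$ is definable in FO for every fixed $r$.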
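The property tested by this example formula can also be checked procedurally, which makes its local character visible: only the subgraph induced on one $r$-sphere is ever inspected. The sketch below is an illustration under stated assumptions: the graph is a list of nodes and a list of directed edges, the two local predicates $T$ and $T'$ are modelled as the colors 0 and 1, edges leaving the sphere are ignored, and the function names are hypothetical.

```python
from collections import deque


def sphere(adj, center, r):
    """Nodes within distance r of center."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] < r:
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
    return set(dist)


def two_colorable(ball, adj):
    """True iff the subgraph induced on `ball` admits a proper 2-coloring."""
    color = {}
    for start in ball:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in ball:
                    continue
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False
    return True


def has_two_colorable_neighborhood(nodes, edges, r):
    """True iff some node has a 2-colorable r-neighborhood."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return any(two_colorable(sphere(adj, x, r), adj) for x in nodes)
```

On a triangle every 1-neighborhood is the whole triangle, so the test fails for `r = 1`; attaching a path of three extra nodes to the triangle makes it succeed, because the neighborhood of a node on the path contains no odd cycle.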
Our main result can now be stated as follows. 

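Theorem 7. An $m$-ary query $Q$, $m \ge 0$, is Hanf-local iff it is definable by a
formula of $\mathcal{LSO}^{*}_{\infty\omega}(\mathbf{C})$ (without free second-sort variables).

Proof sketch. We first show that queries definable in $\mathcal{LSO}^{*}_{\infty\omega}(\mathbf{C})$ are Hanf-local.
The same argument as in m shows that counting terms can be eliminated from
$\mathcal{LSO}^{*}_{\infty\omega}(\mathbf{C})$ without increasing the rank of a formula. Suppose we are given a
signature $\sigma'$ disjoint from $\sigma$. If $\mathcal{A} \in$ STRUCT[$\sigma$], $\vec{a}$ is a $k$-tuple of elements of $A$,
and $C$ is an interpretation of $\sigma'$ predicates as relations of appropriate arity over
$A$, we write $(\mathcal{A}, C, \vec{a})$ for the corresponding structure in the language of $\sigma \cup \sigma'$
union constants for elements of $\vec{a}$. By adom($C$) we mean the active domain of
$C$, that is, the set of all elements of $A$ that occur in relations from $C$. We then
write, for $d \ge r$,

$$(\mathcal{A}, C, \vec{a}) \sim_d (\mathcal{B}, D, \vec{b})$$

if $D$ interprets $\sigma'$ over $B$, $\vec{a}$, $\vec{b}$ are of the same length, and the following three
conditions hold: (1) $(\mathcal{A}, \vec{a}) \leftrightarrows_d (\mathcal{B}, \vec{b})$; (2) adom($C$) $\subseteq S_r^{\mathcal{A}}(\vec{a})$ and adom($D$) $\subseteq
S_r^{\mathcal{B}}(\vec{b})$; and (3) there exists an isomorphism $h : N_d^{\mathcal{A}}(\vec{a}) \to N_d^{\mathcal{B}}(\vec{b})$ such that
$h(C) = D$. The if direction is now implied by the lemma below, simply by
taking $\sigma'$ to be empty.

Lemma 1. Let $\varphi(\vec{x}, \vec{\imath}, \vec{X})$ be a $\mathcal{LSO}^{*,r}_{\infty\omega}(\mathbf{C})$ formula. Then there exists a number
$d \ge r$ such that, for every interpretation $\vec{\imath}_0$ of $\vec{\imath}$, it is the case that $(\mathcal{A}, \vec{a}, C) \sim_d
(\mathcal{B}, \vec{b}, D)$ implies

$$\mathcal{A} \models \varphi(\vec{a}, \vec{\imath}_0, C) \quad\text{iff}\quad \mathcal{B} \models \varphi(\vec{b}, \vec{\imath}_0, D).$$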
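
Proof of the lemma is by induction on formulae. Let $\mathrm{rk}_0(\varphi)$ be defined as
$\mathrm{rk}(\varphi)$ but without taking into account second-order quantification (in particular,
$\mathrm{rk}_0(\varphi) \le \mathrm{rk}(\varphi)$). We show that $d$ can be taken to be $9^m r + (9^m - 1)/2$, where
$m = \mathrm{rk}_0(\varphi)$.

The case requiring most work is that of counting quantifiers; that is, of a
formula $\psi(\vec{x}, \vec{\imath}, \vec{X}) \equiv \exists i z\, \varphi(\vec{x}, z, \vec{\imath}, \vec{X})$. Applying the hypothesis to $\varphi$, we
obtain a number $d \ge r$ such that for every $\vec{\imath}_0$, $(\mathcal{A}, \vec{a}, c, C) \sim_d (\mathcal{B}, \vec{b}, e, D)$
implies that $\mathcal{A} \models \varphi(\vec{a}, c, \vec{\imath}_0, C)$ iff $\mathcal{B} \models \varphi(\vec{b}, e, \vec{\imath}_0, D)$. To conclude, we must prove
that $(\mathcal{A}, \vec{a}, C) \sim_{9d+4} (\mathcal{B}, \vec{b}, D)$ implies that $\mathcal{A} \models \psi(\vec{a}, \vec{\imath}_0, C)$ iff $\mathcal{B} \models \psi(\vec{b}, \vec{\imath}_0, D)$.
For this, it suffices to establish a bijection $f : A \to B$ such that for every $c$,
$(\mathcal{A}, \vec{a}, c, C) \sim_d (\mathcal{B}, \vec{b}, f(c), D)$; then clearly the number of elements satisfying
$\varphi$ will be preserved. The proof of this is based on the following combinatorial
lemma: Assume that $(\mathcal{A}, \vec{a}) \leftrightarrows_{9d+4} (\mathcal{B}, \vec{b})$, and $h$ is an arbitrary isomorphism
$N_{9d+4}^{\mathcal{A}}(\vec{a}) \to N_{9d+4}^{\mathcal{B}}(\vec{b})$. Then there exists a bijection $f : A \to B$ such that on
$S_{6d+3}^{\mathcal{A}}(\vec{a})$ it coincides with $h$, and $(\mathcal{A}, \vec{a}c) \leftrightarrows_d (\mathcal{B}, \vec{b}f(c))$ for every $c \in A$.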
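The closed form $9^m r + (9^m - 1)/2$ for the locality radius follows from the step $d \mapsto 9d + 4$ for each counting quantifier. As a worked check (our own arithmetic, not from the paper): for $m = 0$ the expression equals $r$, and if a formula of rank $m$ admits the radius $9^m r + (9^m - 1)/2$, then one more counting quantifier gives

```latex
\begin{align*}
9\Bigl(9^{m} r + \tfrac{9^{m}-1}{2}\Bigr) + 4
  &= 9^{m+1} r + \tfrac{9^{m+1}-9}{2} + 4 \\
  &= 9^{m+1} r + \tfrac{9^{m+1}-1}{2}.
\end{align*}
```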

To prove the only if part, we show that with local second-order quantification,
one can define local orderings on neighborhoods, and then the counting power of
$\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ allows one to code neighborhoods with numbers. The construction can
be carried out in such a way that the entire multiset of isomorphism types of
neighborhoods in a structure is coded by a formula whose rank is only determined
by the radius of neighborhoods and the signature $\sigma$. Using this, one can express
any Hanf-local query in $\mathcal{LSO}^{*}_{\infty\omega}(\mathbf{C})$. □

There are several corollaries to the proof. First notice that if we defined
$\mathcal{LSO}^{*}_{\infty\omega}(\mathbf{C})$ without increasing the rank of a formula for every second-order
local quantifier, the proof would go through verbatim. We can also define a
logic just as £SO'^^{C) except that first-order local quantification 

3z G Sr(x) and Vz G Sr{x) is used in place of second-order local quantifiers, 
and those local quantifiers do not increase the rank (in particular, the depth 
of their nesting can be infinite, which allows one to define arbitrary computa- 
tions on those neighborhoods). Let then LJ^,^(C) be Ur (^)- The proof of 
Hanf-locality of L^^(C) goes through as before, and proving that every Hanf- 
local query is definable in (C) is very similar to that of £SO’^^{C) as with 
infinitely many local first-order quantifiers we can write out diagrams of neigh- 
borhoods. We thus obtain: 

Corollary 3. The following have the same expressive power as $\mathcal{LSO}^{*}_{\infty\omega}(\mathbf{C})$ (and
thus capture Hanf-local properties):

— the logic obtained from $\mathcal{LSO}^{*}_{\infty\omega}(\mathbf{C})$ by allowing the depth of nesting of local
quantifiers to be infinite, and

— the logic LJ^^^(C). □ 

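Analyzing the proof of Theorem 7, we also obtain the following normal form
for $\mathcal{LSO}^{*}_{\infty\omega}(\mathbf{C})$ formulae, which shows that the depth of nesting of local
second-order quantifiers need not exceed 1.

Corollary 4. Every $\mathcal{LSO}^{*}_{\infty\omega}(\mathbf{C})$ formula $\psi(\vec{x})$ is equivalent to a formula in the
form

$$\bigvee_i \bigwedge_j \bigl( n_{ij} = \#\vec{y}.(\exists S \sqsubseteq S_d(\vec{x})\ \psi_{ij}(\vec{x}, \vec{y}, S)) \bigr)$$

where the conjunctions are finite, $S$ is binary, and each $\psi_{ij}$ is a $\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ formula.

As a final remark, we note that $\mathcal{LSO}^{*}_{\infty\omega}(\mathbf{C})$ is strictly more expressive than
$\mathcal{L}^*_{\infty\omega}(\mathbf{C})$ extended with tests for neighborhood isomorphisms.

Proposition 3. $\bigcup_{d>0} (\mathcal{L}^*_{\infty\omega}(\mathbf{C}) + \{I_d^k \mid k > 0\}) \subsetneq \mathcal{LSO}^{*}_{\infty\omega}(\mathbf{C})$. □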

5 Characterizing Gaifman-Local Properties 

We now turn to Gaifman’s notion of locality, which states that a query $Q$ is
local with lr($Q$) $\le r$ if $N_r^{\mathcal{A}}(\vec{a}_1) \cong N_r^{\mathcal{A}}(\vec{a}_2)$ implies that $\vec{a}_1 \in Q(\mathcal{A})$ iff $\vec{a}_2 \in Q(\mathcal{A})$.
For structures of bounded valence, this notion was characterized by first-order
definition by cases. An extended version of this notion captures Gaifman-locality
in the general case.

Definition 5. An $m$-ary query, $m > 0$, on STRUCT[$\sigma$] is given by a Hanf-local
definition by cases if there exists a finite or countable partition of STRUCT[$\sigma$]
into classes $\mathcal{C}_i$, $i \in \mathbb{N}$, a number $d > 0$, and Hanf-local queries $Q_i$, $i \in \mathbb{N}$,
with hlr($Q_i$) $\le d$, such that for every $i$ and every $\mathcal{A} \in \mathcal{C}_i$, it is the case that
$Q(\mathcal{A}) = Q_i(\mathcal{A})$.

Theorem 8. A query is Gaifman-local iff it is given by a Hanf-local definition 
by cases. 

Proof sketch. Assume that $Q$ is given by a Hanf-local definition by cases. Let
$d$ be an upper bound on hlr($Q_i$). Then $Q$ is Gaifman-local and lr($Q$) $\le 3d + 1$.
Fix $\mathcal{A}$, and assume $\mathcal{A} \in \mathcal{C}_i$. Let $\vec{a}_1 \approx_{3d+1}^{\mathcal{A}} \vec{a}_2$. Then by HH, $(\mathcal{A}, \vec{a}_1) \leftrightarrows_d (\mathcal{A}, \vec{a}_2)$,
and Hanf-locality of $Q_i$ implies $\vec{a}_1 \in Q_i(\mathcal{A}) = Q(\mathcal{A})$ iff $\vec{a}_2 \in Q_i(\mathcal{A}) = Q(\mathcal{A})$.
Conversely, let a Gaifman-local $Q$ be given, with lr($Q$) $= d$. Let $\tau_1, \tau_2, \ldots$ be an
enumeration of isomorphism types of finite $\sigma$-structures. Let $\mathcal{C}_i$ be the class of
structures of type $\tau_i$. We define $Q_i$ as follows: $\vec{b} \in Q_i(\mathcal{B})$ iff there exists $\mathcal{A}$ of type
$\tau_i$ and $\vec{a} \in A^m$ such that $(\mathcal{B}, \vec{b}) \leftrightarrows_d (\mathcal{A}, \vec{a})$ and $\vec{a} \in Q(\mathcal{A})$. One then shows that each
$Q_i$ is Hanf-local, with hlr($Q_i$) $\le d$, and for every $\mathcal{A}$ of type $\tau_i$, $Q(\mathcal{A}) = Q_i(\mathcal{A})$. □

Unlike in Fact 5, the number of cases in a Hanf-local definition by cases
can be infinite. A natural question to ask is whether a finite number of cases
is sufficient (in particular, whether the statement of Fact 5 holds for arbitrary
finite structures). We now show that the infinite number of cases is unavoidable.
In fact, we show a stronger result.

Definition 6. For $k > 0$, let Local$_k$ be the class of queries given by a Hanf-local
definition by cases, where the number of cases is at most $k$. Let Local$_*$ be
$\bigcup_{k>0}$ Local$_k$, and G_Local be the class of all Gaifman-local queries.

Note that Local$_1$ is precisely the class of Hanf-local queries.

Theorem 9. The hierarchy

$$\mathrm{Local}_1 \subset \mathrm{Local}_2 \subset \ldots \subset \mathrm{Local}_* \subset \mathrm{G\_Local}$$

is strict.

Proof sketch. We first exhibit a query $Q \in$ Local$_{l+1}$ − Local$_l$. Intuitively,
a query from Local$_l$ cannot make $l + 1$ choices, and thus is different from
every query in Local$_{l+1}$ on some class of the partition. More precisely, we
define a class $\mathcal{C}_i^{l+1}$, $1 \le i \le l + 1$, to be the class of graphs with the number
of connected components being $i - 1$ modulo $l + 1$. Let $Q_i$ be a FO-definable
query returning the set of nodes reachable by a path of length $i - 1$ from a node
of indegree 0. Form the query $Q$ that coincides with $Q_i$ on $\mathcal{C}_i^{l+1}$. (Note that
$Q$ is not FO, as the classes $\mathcal{C}_i^{l+1}$ are not FO-definable.) From Theorem 8 this
is a Gaifman-local query, and it belongs to Local$_{l+1}$. Suppose $Q$ is in Local$_l$;
that is, there is a partition of the class of all finite graphs into $l$ classes $\mathcal{C}'_1, \ldots, \mathcal{C}'_l$
and Hanf-local queries $Q'_i$ such that on $\mathcal{C}'_i$, $Q$ coincides with $Q'_i$, $i = 1, \ldots, l$. Let
$d = 1 + \max_i$ hlr($Q'_i$). Let $G_0$ be a successor relation on $l + 1$ nodes. Define a graph
$G_i$ as the union of $i$ cycles with $(l + 1)!(2d + 1)/i$ nodes each, $i = 1, \ldots, l + 1$. As the
total number of nodes in each $G_i$ is $(l + 1)!(2d + 1)$ and all $d$-neighborhoods
are isomorphic, we have $G_i \leftrightarrows_d G_j$ for all $i, j \le l + 1$. Let now $G_i^{l+1}$ be the
disjoint union of $G_0$ and $G_i$, $i = 1, \ldots, l + 1$. By pigeonhole, there exists a
class $\mathcal{C}'_k$ and $i \ne j$, $i, j \le l + 1$, such that $G_i^{l+1}, G_j^{l+1} \in \mathcal{C}'_k$. We then show that
$Q$ cannot give correct results on both $G_i^{l+1}$ and $G_j^{l+1}$. The separation of G_Local
from Local$_*$ is proved by a minor modification of the construction above. □

Thus, similarly to the case of Hanf-local queries, the characterization for 
structures of bounded valence fails to extend to the class of all finite structures. 

Corollary 5. There exist Gaifman-local queries that cannot be given by first- 
order definition by cases. □ 

6 Conclusion 

Notions of locality have been used in logic numerous times. The local nature of 
first-order logic is particularly transparent when one deals with fragments cor- 
responding to various modal logics; in general, Gaifman’s and Hanf’s theorems 
state that FO can only express local properties. These theorems were general- 
ized, and, being applicable to finite structures, they found applications in areas 
such as complexity and databases. 

However, while more and more powerful logics were proved to be local, there 
was no clear understanding of what kind of mechanisms can be added to logics 
while preserving locality. Here we answered this question by providing logical 
characterizations of local properties on finite structures. For Hanf-locality,
arbitrary counting power and arbitrary computations over small neighborhoods
can be added to first-order logic while retaining locality; moreover, with a
limited form of infinitary connectives, such a logic captures all Hanf-local
properties. For Gaifman-locality, one can in addition permit definition by cases, and
the number of cases can be either finite or infinite.

Abstract. Motivated by description logics, we investigate what happens
to the complexity of modal satisfiability problems if we only allow
formulas built from literals, ∧, ◇, and □. Previously, the only known result
was that the complexity of the satisfiability problem for K dropped from
PSPACE-complete to coNP-complete (Schmidt-Schauss and Smolka |S]
and Donini et al. P|). In this paper we show that not all modal logics
behave like K. In particular, we show that the complexity of the
satisfiability problem with respect to frames in which each world has at least
one successor drops from PSPACE-complete to P, but that in contrast
the satisfiability problem with respect to the class of frames in which
each world has at most two successors remains PSPACE-complete. As a
corollary of the latter result, we also solve the open problem from Donini
et al.’s complexity classification of description logics [2]. In the last
section, we classify the complexity of the satisfiability problem for K for all
other restrictions on the set of operators.



1 Introduction 

Since consistent normal modal logics contain propositional logic, the satisfiability 
problems for all these logics are automatically NP-hard. In fact, as shown by 
Ladner m, many of them are even PSPACE-hard. 

But we don’t always need all of propositional logic. For example, in some ap- 
plications we may use only a finite set of propositional variables. Propositional 
satisfiability thus restricted is in P, and, as shown by Halpern ^ , the complexity 
of satisfiability problems for some modal logics restricted in the same way also de- 
creases. For example, the complexity of S5 satisfiability drops from NP-complete 
to P. On the other hand, K satisfiability remains PSPACE-complete. The same 
restriction for linear temporal logics was studied in Demri and Schnoebelen Q. 

Restricting the number of propositional variables is not the only
propositional restriction on modal logics that occurs in the literature. For example, the
description logic $\mathcal{ALE}$ can be viewed as multi-modal K where the formulas are
built from literals, ∧, ◇s, and □s.

* Supported in part by grant NSF-INT-9815095. Work done in part while visiting the
University of Amsterdam.

As in the case of a fixed number of propositional variables, satisfiability for
propositional logic for formulas built from literals and ∧ is easily seen to be in P.
After all, in that case every propositional formula is the conjunction of literals.
Such a formula is satisfiable if and only if there is no propositional variable $p$
such that both $p$ and $\overline{p}$ are conjuncts of the formula.
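The polynomial-time test for conjunctions of literals is a single pass over the conjuncts. A minimal sketch, assuming a literal is encoded as a pair `(variable_name, is_positive)`; the encoding and the function name are ours, not the paper's.

```python
def conjunction_satisfiable(literals):
    """A conjunction of literals is satisfiable iff no variable occurs both
    positively and negatively among its conjuncts."""
    positive = {name for name, is_positive in literals if is_positive}
    negative = {name for name, is_positive in literals if not is_positive}
    return positive.isdisjoint(negative)
```

For instance, the conjunction of $p$, $\overline{q}$ and $r$ passes the test, while adding $\overline{p}$ makes it fail.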

Hence, satisfiability for modal logics for formulas built from literals, ∧, □,
and ◇ is not automatically NP-hard. Of course, it does not necessarily follow
that the complexity of modal satisfiability problems will drop significantly. The
only result that was previously known is that the complexity of K satisfiability
(i.e., satisfiability with respect to the class of all frames) drops from
PSPACE-complete to coNP-complete. The upper bound was shown by Schmidt-Schauss
and Smolka |^, and the lower bound by Donini et al. |^. It should be noted
that these results were shown in the context of description logics (a.k.a. concept
languages), so that the notation in these papers is quite different from ours.¹
In addition, their language contains the constants true and false. However, it is
easy to simulate these constants by propositional variables. See the full version
of this paper for details |S|.

In this paper we investigate if it is always the case that the complexity of the
satisfiability problem decreases if we only look at formulas that are built from
literals, ∧, ◇, and □, and if so, if there are upper or lower bounds on the amount
that the complexity drops.

We will show that not all logics behave like K. Far from it, by looking at 
simple restrictions on the number of successors that are allowed for each world 
in a frame, we obtain different levels of complexity, making apparent a subtle 
interplay between frames and operators. In particular, we will show that 

1. The complexity of the satisfiability problem with respect to linear frames 
drops from NP-complete to P. 

2. The complexity of the satisfiability problem with respect to remains 

NP-complete. 

3. The complexity of the satisfiability problem with respect to frames in which 
every world has at least one successor drops from PSPACE-complete to P. 

4. The complexity of the satisfiability problem with respect to frames in which 
every world has at most two successors remains PSPACE-complete. 

As a corollary of the last result, we also solve the open problem from Donini et
al.’s complexity classification of description logics [2].

In the last section, we completely classify the complexity of the satisfiability
problem (with respect to the class of all frames) for all possible restrictions on
the set of operators allowed, to gain more insight into the sources of complexity
for modal logics. It turns out that the restriction studied in this paper, which
we will call poor man’s logic, is the only fragment whose satisfiability problem
¹ Certain description logics can be viewed as syntactic variations of modal logics in
the following way: the universal concept corresponds to true, the empty concept
corresponds to false, atomic concepts correspond to propositional variables, atomic
negation corresponds to propositional negation, complementation corresponds to
negation, intersection corresponds to conjunction, union corresponds to disjunction,
universal role quantifications correspond to □ operators, and existential role
quantifications correspond to ◇ operators [Jj.



Edith Hemaspaandra



is so unusual. For all other restrictions, the satisfiability problems are PSPACE- 
complete, NP-complete, or in P. These are exactly the complexity classes that 
one would expect to show up in this context. 

2 Definitions 

We will first briefly review syntax, Kripke semantics, and some basic terminology
for modal logic.



Syntax 

The set of $\mathcal{L}$ formulas is inductively defined as follows. (As usual, we assume
that we have a countably infinite set of propositional variables.)

— $p$ and $\overline{p}$ are $\mathcal{L}$ formulas for every propositional variable $p$,

— if $\phi$ and $\psi$ are $\mathcal{L}$ formulas, then so are $\phi \wedge \psi$ and $\phi \vee \psi$, and

— if $\phi$ is an $\mathcal{L}$ formula, then $\Box\phi$ and $\Diamond\phi$ are $\mathcal{L}$ formulas.

We will identify $\overline{\overline{p}}$ with $p$.

The modal depth of a formula $\phi$ (denoted by $md(\phi)$) is the depth of nesting
of the modal operators □ and ◇.



Semantics 

A frame is a tuple $F = \langle W, R \rangle$ where $W$ is a non-empty set of possible worlds,
and $R$ is a binary relation on $W$ called the accessibility relation.

A model is of the form $M = \langle W, R, \pi \rangle$ such that $\langle W, R \rangle$ is a frame (we say
that $M$ is based on this frame), and $\pi$ is a function from the set of propositional
variables to $Pow(W)$: a valuation, i.e., $\pi(p)$ is the set of worlds in which $p$ is
true. For $\phi$ an $\mathcal{L}$ formula, we will write $M, w \models \phi$ for "$\phi$ is true/satisfied at $w$ in
$M$". The truth relation $\models$ is defined by induction on $\phi$ in the following way.

- $M, w \models p$ iff $w \in \pi(p)$ for $p$ a propositional variable.

- $M, w \models \bar{p}$ iff $w \notin \pi(p)$ for $p$ a propositional variable.

- $M, w \models \phi \wedge \psi$ iff $M, w \models \phi$ and $M, w \models \psi$.

- $M, w \models \phi \vee \psi$ iff $M, w \models \phi$ or $M, w \models \psi$.

- $M, w \models \Box\phi$ iff $\forall w' \in W [wRw' \Rightarrow M, w' \models \phi]$.

- $M, w \models \Diamond\phi$ iff $\exists w' \in W [wRw'$ and $M, w' \models \phi]$.
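
The truth clauses above translate almost line for line into a recursive evaluator. The following Python sketch is illustrative only: the tuple encoding of formulas and the names `holds` and `satisfied_in` are assumptions made here, not notation from the paper.

```python
# Assumed encoding (not from the paper):
#   ("var", "p")   propositional variable p
#   ("nvar", "p")  negated variable (p with a bar)
#   ("and", a, b), ("or", a, b), ("box", a), ("dia", a)
# A model is a triple (worlds, succ, val): succ maps a world to its
# R-successors, val maps a variable name to the set of worlds where it holds.

def holds(model, w, phi):
    """Return True iff phi is true at world w of the model."""
    worlds, succ, val = model
    tag = phi[0]
    if tag == "var":
        return w in val.get(phi[1], set())
    if tag == "nvar":
        return w not in val.get(phi[1], set())
    if tag == "and":
        return holds(model, w, phi[1]) and holds(model, w, phi[2])
    if tag == "or":
        return holds(model, w, phi[1]) or holds(model, w, phi[2])
    if tag == "box":   # true at every successor (vacuously true if none)
        return all(holds(model, v, phi[1]) for v in succ.get(w, ()))
    if tag == "dia":   # true at some successor
        return any(holds(model, v, phi[1]) for v in succ.get(w, ()))
    raise ValueError(f"unknown connective: {tag!r}")


def satisfied_in(model, phi):
    """phi is satisfied in the model if it holds at some world."""
    return any(holds(model, w, phi) for w in model[0])


# World 0 sees worlds 1 and 2; p holds only at world 1.
example = ({0, 1, 2}, {0: [1, 2]}, {"p": {1}})
print(holds(example, 0, ("dia", ("var", "p"))))  # some successor satisfies p
print(holds(example, 0, ("box", ("var", "p"))))  # world 2 falsifies p
```

Note that a world without successors satisfies every $\Box$ formula and no $\Diamond$ formula, which is the freedom that later separates K from $\mathcal{F}_{\ge 1}$.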

The size of a model or frame is the number of worlds in the model or frame. 

The notion of satisfiability can be extended to models, frames, and classes
of frames in the following way. $\phi$ is satisfied in model $M$ if $M, w \models \phi$ for some
world $w$ in $M$, $\phi$ is satisfiable in frame $F$ ($F$ satisfiable) if $\phi$ is satisfied in $M$ for
some model $M$ based on $F$, and $\phi$ is satisfiable with respect to class of frames
$\mathcal{F}$ ($\mathcal{F}$ satisfiable) if $\phi$ is satisfiable in some frame $F \in \mathcal{F}$.

As is usual, we will look at satisfiability with respect to classes of frames. For
a class of frames $\mathcal{F}$, the satisfiability problem with respect to $\mathcal{F}$ is the problem
of determining, given an $\mathcal{L}$ formula $\phi$, whether $\phi$ is $\mathcal{F}$ satisfiable. For a complete
logic L, we will sometimes view L as the class of frames where L is valid. For
example, we will speak of K satisfiability when we mean satisfiability with respect
to all frames. Likewise, we will on occasion identify a class of frames with its
logic, i.e., with the set of formulas valid on this class of frames.



Poor Man’s Logic 

The set of poor man’s formulas is the set of $\mathcal{L}$ formulas that do not contain
$\vee$. The poor man’s satisfiability problem with respect to $\mathcal{F}$ is the problem of
determining, given a poor man’s formula $\phi$, whether $\phi$ is $\mathcal{F}$ satisfiable.

In poor man’s language, we will view $\wedge$ as a multi-arity operator, and we
will assume that all conjunctions are “flattened,” that is, a conjunct will not
be a conjunction. Thus, a formula $\phi$ in this language is of the following form:
$\phi = \Box\psi_1 \wedge \cdots \wedge \Box\psi_k \wedge \Diamond\xi_1 \wedge \cdots \wedge \Diamond\xi_m \wedge \ell_1 \wedge \cdots \wedge \ell_s$, where the $\ell_i$s are literals.

In all but the last section of this paper, we will compare the complexity of
satisfiability to the complexity of poor man’s satisfiability with respect to the
same class of frames. We are interested in simple restrictions on the number of
successor worlds that are allowed. Let $\mathcal{F}_{\le 1}$, $\mathcal{F}_{\le 2}$, $\mathcal{F}_{\ge 1}$ be the classes of frames
in which every world has at most one, at most two, and at least one successor,
respectively.



3 Poor Man’s Versions of NP-Complete Satisfiability 
Problems 

We already know that the poor man’s version of an NP-complete modal satis- 
fiability problem can be in P. Look for example at satisfiability with respect to 
the class of frames where no world has a successor. This is plain propositional 
logic in disguise, and it inherits the complexity behavior of propositional logic. 
As mentioned in the introduction, the complexity of satisfiability drops from 
NP-complete to P. 

In this section, we will give an example of a non-trivial modal logic with the 
same behavior. We will show that the poor man’s version of satisfiability with 
respect to linear frames is in P. In contrast, we will also give a very simple ex- 
ample of a modal logic where the complexity of poor man’s satisfiability remains 
NP-complete. 

Theorem 1. Satisfiability with respect to $\mathcal{F}_{\le 1}$ is NP-complete and poor man’s
satisfiability with respect to $\mathcal{F}_{\le 1}$ is in P.

Proof. Clearly, $\mathcal{F}_{\le 1}$ satisfiability is in NP (and thus NP-complete), since every
satisfiable formula is satisfiable on a linear frame with at most $md(\phi) + 1$ worlds, where
$md(\phi)$ is the modal depth of $\phi$. This immediately gives the following NP algorithm
for $\mathcal{F}_{\le 1}$ satisfiability: Guess a linear frame of size at most $md(\phi) + 1$, and for every
world in the frame, guess a valuation on the propositional variables that occur
in $\phi$. Accept if and only if the guessed model satisfies $\phi$.

It is easy to see that the following polynomial-time algorithm decides poor
man’s satisfiability with respect to $\mathcal{F}_{\le 1}$. Let $\phi = \Box\psi_1 \wedge \cdots \wedge \Box\psi_k \wedge \Diamond\xi_1 \wedge \cdots \wedge \Diamond\xi_m \wedge \ell_1 \wedge \cdots \wedge \ell_s$,
where the $\ell_i$s are literals. $\phi$ is $\mathcal{F}_{\le 1}$ satisfiable if and only if

- $\ell_1 \wedge \cdots \wedge \ell_s$ is satisfiable (that is, for all $i$ and $j$, $\ell_i \ne \bar{\ell}_j$), and

- one of the following holds:
  • $m = 0$ (that is, $\phi$ does not contain conjuncts of the form $\Diamond\xi$, in which
    case the formula is satisfied in a world with no successors), or
  • $\bigwedge_{i=1}^{k} \psi_i \wedge \bigwedge_{i=1}^{m} \xi_i$ is $\mathcal{F}_{\le 1}$ satisfiable (the world has exactly one successor).

□ 
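
The polynomial-time procedure in the proof of Theorem 1 can be sketched as a short recursion. This is a minimal Python illustration under an assumed encoding: a poor man's formula is a tuple of conjuncts, and each conjunct is `("lit", name, positive)`, `("box", body)` or `("dia", body)` with `body` again a tuple of conjuncts. The function name `sat_le1` is hypothetical.

```python
def sat_le1(phi):
    """Decide satisfiability of a flattened poor man's formula on frames
    where every world has at most one successor (sketch of Theorem 1)."""
    signs = {}            # variable name -> required truth value
    boxes, dias = [], []
    for conjunct in phi:
        if conjunct[0] == "lit":
            _, name, positive = conjunct
            # The literals must not contain a variable and its negation.
            if signs.setdefault(name, positive) != positive:
                return False
        elif conjunct[0] == "box":
            boxes.append(conjunct[1])
        else:  # "dia"
            dias.append(conjunct[1])
    if not dias:
        # m = 0: a world without successors satisfies every box conjunct.
        return True
    # Otherwise the single successor must satisfy all psi_i and all xi_j.
    merged = tuple(c for body in boxes + dias for c in body)
    return sat_le1(merged)


p, not_p = ("lit", "p", True), ("lit", "p", False)
print(sat_le1((("dia", (p,)), ("dia", (not_p,)))))  # one successor cannot do both
print(sat_le1((("box", (p,)), ("box", (not_p,)))))  # fine: no successor needed
```

Each recursive call strips one level of modal operators and only concatenates subformulas of the input, so the running time is polynomial in the length of the formula.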

From the previous example, you might think that the poor man’s versions of 
logics with the poly-size frame property are in P, or even that the poor man’s 
versions of all NP-complete satisfiability problems are in P. Not so. The following 
theorem gives a very simple counterexample. 

Theorem 2. Satisfiability and poor man’s satisfiability with respect to the frame 
are NP-complete. 

Proof. Because the frame is finite, both satisfiability problems are in NP. Thus
it suffices to show that poor man’s satisfiability with respect to this frame is NP-hard.

Since we are working with a fragment of propositional modal logic, it is 
extremely tempting to try to reduce from an NP-complete propositional satis- 
fiability problem. However, because poor man’s logics contain only a fragment 
of propositional logic, these logics don’t behave like propositional logic at all. 
Because of this, propositional satisfiability problems are not the best choice of 
problems to reduce from. In fact, they are particularly confusing. 

It turns out that it is much easier to reduce a partitioning problem to our 
poor man’s satisfiability problem. We will reduce from the following well-known 
NP-complete problem. 

GRAPH 3-COLORABILITY: Given an undirected graph G, can you 
color every vertex of the graph using only three colors in such a way 
that vertices connected by an edge have different colors? 

Suppose $G = (V, E)$ where $V = \{1, 2, \ldots, n\}$. We introduce a propositional
variable $p_e$ for every edge $e$. The three leaves of the frame will correspond to the
three colors. To ensure that adjacent vertices in the graph end up in different
leaves, we will make sure that the smaller endpoint of $e$ satisfies $p_e$ and that the
larger endpoint of $e$ satisfies $\bar{p}_e$.

The requirements for vertex $i$ are given by the following formula:

$$\psi_i = \bigwedge \{p_e \mid e = \{i, j\} \text{ and } i < j\} \wedge \bigwedge \{\bar{p}_e \mid e = \{i, j\} \text{ and } i > j\}.$$

Define $f(G) = \bigwedge_{i=1}^{n} \Diamond\psi_i$.

$f$ is clearly computable in polynomial time. To show that $f$ is indeed a reduction
from GRAPH 3-COLORABILITY to poor man’s satisfiability with respect
to this frame, first note that it is easy to see that for every set $V' \subseteq V$, the following
holds: $\bigwedge_{i \in V'} \psi_i$ is satisfiable if and only if no two vertices in $V'$ are connected
by an edge.

It follows that $f(G) = \bigwedge_{i=1}^{n} \Diamond\psi_i$ is satisfiable on the frame if and only if there
exist sets of vertices $V_1$, $V_2$, and $V_3$ such that $V = V_1 \cup V_2 \cup V_3$ and $\bigwedge_{i \in V_j} \psi_i$ is
satisfiable for $j \in \{1, 2, 3\}$. This holds if and only if there exist sets of vertices
$V_1$, $V_2$, and $V_3$ such that $V = V_1 \cup V_2 \cup V_3$ and no two vertices in $V_j$ are adjacent
for $j \in \{1, 2, 3\}$, which is the case if and only if $G$ is 3-colorable. (We obtain a
coloring by coloring each vertex $v$ by the smallest $j$ such that $v \in V_j$.)

□ 
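
To make the reduction concrete, the sketch below builds the literal sets of the formulas $\psi_i$ and checks, by brute force, whether every $\Diamond\psi_i$ can be witnessed by one of three leaves. It assumes the frame of Theorem 2 is a root with exactly three leaf successors (the frame diagram is not legible in this copy), and all function names are hypothetical. The exhaustive search is exponential and is only meant to illustrate the correspondence with 3-colorings.

```python
from itertools import product


def psi(i, edges):
    """Literals required at vertex i: p_e if i is the smaller endpoint of
    edge e, the negation of p_e if i is the larger endpoint."""
    lits = set()
    for u, v in edges:
        lo, hi = min(u, v), max(u, v)
        if i == lo:
            lits.add(((lo, hi), True))
        elif i == hi:
            lits.add(((lo, hi), False))
    return lits


def consistent(lits):
    """A set of literals is satisfiable iff it has no complementary pair."""
    return not any((e, not sign) in lits for e, sign in lits)


def reduction_satisfiable(n, edges):
    """Is the conjunction of Diamond psi_i (i = 1..n) satisfiable at a root
    with three leaves?  Try every assignment of vertices to leaves."""
    psis = [psi(i, edges) for i in range(1, n + 1)]
    for leaves in product(range(3), repeat=n):
        if all(
            consistent(set().union(*[psis[i] for i in range(n) if leaves[i] == leaf]))
            for leaf in range(3)
        ):
            return True
    return False


triangle = [(1, 2), (2, 3), (1, 3)]
k4 = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
print(reduction_satisfiable(3, triangle))  # a triangle is 3-colorable
print(reduction_satisfiable(4, k4))        # K4 is not
```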



4 Poor Man’s Versions of PSPACE-Complete 
Satisfiability Problems 

It is well-known that the satisfiability problems for many modal logics including 
K are PSPACE-complete 0. We also know that poor man’s satisfiability for K 
is coNP-complete |I8I3I . That is, in that particular case the complexity of the 
satisfiability problem drops from PSPACE-complete to coNP-complete. Is this 
the general pattern? We will show that this is not the case. We will give an
example of a logic where the complexity of the satisfiability problem drops from
PSPACE-complete all the way down to P, and another example in which both
the satisfiability and the poor man’s satisfiability problems
are PSPACE-complete. Both examples are really close to K; they are satisfiability
with respect to $\mathcal{F}_{\ge 1}$ and $\mathcal{F}_{\le 2}$, respectively.

We will first consider $\mathcal{F}_{\ge 1}$. This logic is very close to K and it should come
as no surprise that the complexity of $\mathcal{F}_{\ge 1}$ satisfiability and K satisfiability are
the same. It may come as a surprise to learn that poor man’s satisfiability with
respect to $\mathcal{F}_{\ge 1}$ is in P. It is easy to show that poor man’s satisfiability with
respect to $\mathcal{F}_{\ge 1}$ is in coNP, because the following function $f$ reduces the poor
man’s satisfiability problem with respect to $\mathcal{F}_{\ge 1}$ to the poor man’s satisfiability
problem for K.

$$f(\phi) = \phi \wedge \bigwedge_{i=0}^{md(\phi)} \Box^i \Diamond q,$$

where $q$ is a propositional variable not in $\phi$. The formula ensures that every
world in the relevant part of the K frame has at least one successor.

It is very surprising that poor man’s satisfiability with respect to $\mathcal{F}_{\ge 1}$ is in P,
because the relevant part of the $\mathcal{F}_{\ge 1}$ frame may require an exponential number
of worlds to satisfy a formula in poor man’s language. For example, consider the
following formula:

$$\Diamond\Box\Box p_1 \wedge \Diamond\Box\Box \bar{p}_1 \wedge \Box(\Diamond\Box p_2 \wedge \Diamond\Box \bar{p}_2) \wedge \Box\Box(\Diamond p_3 \wedge \Diamond \bar{p}_3).$$

If this formula is satisfiable in world $w$, then for every assignment to $p_1$, $p_2$,
and $p_3$, there exists a world reachable in three steps from $w$ that satisfies that
assignment.

In its general version, the formula becomes

$$\phi_{asg} = \bigwedge_{i=1}^{n} \Box^{i-1}\left(\Diamond\Box^{n-i} p_i \wedge \Diamond\Box^{n-i} \bar{p}_i\right).$$

The formula is of length polynomial in $n$ and forces the relevant part of the
model to be of exponential size.
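
As a small illustration of how short the forcing formula is, the following Python sketch prints $\phi_{asg}$ for a given $n$. The ASCII notation is an assumption of this sketch: `[]` stands for $\Box$, `<>` for $\Diamond$, `~p` for $\bar{p}$ and `&` for $\wedge$; the function name `phi_asg` is hypothetical.

```python
def phi_asg(n):
    """Return the assignment-forcing formula for variables p1..pn as a string."""
    parts = []
    for i in range(1, n + 1):
        prefix = "[]" * (i - 1)   # Box^{i-1}
        inner = "[]" * (n - i)    # Box^{n-i}
        parts.append(f"{prefix}(<>{inner}p{i} & <>{inner}~p{i})")
    return " & ".join(parts)


print(phi_asg(3))
# (<>[][]p1 & <>[][]~p1) & [](<>[]p2 & <>[]~p2) & [][](<>p3 & <>~p3)
```

The $i$-th conjunct contains $2n - i + 1$ modal operators, so the whole formula has quadratically many symbols, while any model satisfying it must contain a world for each of the $2^n$ assignments.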

Now that we have seen how surprising it is that poor man’s satisfiability with
respect to $\mathcal{F}_{\ge 1}$ is in P, let’s prove it.

Theorem 3. Satisfiability with respect to $\mathcal{F}_{\ge 1}$ is PSPACE-complete and poor
man’s satisfiability with respect to $\mathcal{F}_{\ge 1}$ is in P.

Proof. The proof that satisfiability with respect to $\mathcal{F}_{\ge 1}$ is PSPACE-complete is
very close to the proof that K satisfiability is PSPACE-complete and therefore
omitted.

For the poor man’s satisfiability problem, note that a simplified version of
Ladner’s PSPACE upper bound construction for K can be used to show the
following.

Let $\phi = \Box\psi_1 \wedge \cdots \wedge \Box\psi_k \wedge \Diamond\xi_1 \wedge \cdots \wedge \Diamond\xi_m \wedge \ell_1 \wedge \cdots \wedge \ell_s$, where the $\ell_i$s are
literals. $\phi$ is $\mathcal{F}_{\ge 1}$ satisfiable if and only if

1. $\ell_1 \wedge \cdots \wedge \ell_s$ is satisfiable,

2. for all $j$, $\psi_1 \wedge \cdots \wedge \psi_k \wedge \xi_j$ is $\mathcal{F}_{\ge 1}$ satisfiable, and

3. $\psi_1 \wedge \cdots \wedge \psi_k$ is $\mathcal{F}_{\ge 1}$ satisfiable (only relevant when $m = 0$).

Note that this algorithm takes exponential time and polynomial space. Of
course, we already know that poor man’s satisfiability with respect to $\mathcal{F}_{\ge 1}$ is in
PSPACE, since satisfiability with respect to $\mathcal{F}_{\ge 1}$ is in PSPACE. How can this
PSPACE algorithm help to prove that poor man’s satisfiability with respect to
$\mathcal{F}_{\ge 1}$ is in P?

Something really surprising happens here. We will prove that for every poor
man’s formula $\phi$, $\phi$ is $\mathcal{F}_{\ge 1}$ satisfiable if and only if (the conjunction of) every
pair of (not necessarily different) conjuncts of $\phi$ is $\mathcal{F}_{\ge 1}$ satisfiable. Using dynamic
programming, we can compute all pairs of subformulas of $\phi$ that are $\mathcal{F}_{\ge 1}$
satisfiable in polynomial time. This proves the theorem. It remains to show that
for every poor man’s formula $\phi$, $\phi$ is $\mathcal{F}_{\ge 1}$ satisfiable if and only if every pair of
conjuncts of $\phi$ is $\mathcal{F}_{\ge 1}$ satisfiable. We will prove this claim by induction on $md(\phi)$,
the modal depth of $\phi$. In the proof, we will write “satisfiable” for “satisfiable
with respect to $\mathcal{F}_{\ge 1}$.”

If $md(\phi) = 0$, $\phi$ is a conjunction of literals. In that case $\phi$ is not satisfiable if
and only if there exist $i$ and $j$ such that $\ell_i = \bar{\ell}_j$. This immediately implies our
claim.

For the induction step, suppose $\phi = \Box\psi_1 \wedge \cdots \wedge \Box\psi_k \wedge \Diamond\xi_1 \wedge \cdots \wedge \Diamond\xi_m \wedge \ell_1 \wedge \cdots \wedge \ell_s$
(where the $\ell_i$s are literals), $md(\phi) \ge 1$, and suppose that our claim holds
for all formulas of modal depth $< md(\phi)$. Suppose for a contradiction that $\phi$ is
not satisfiable, though every pair of conjuncts of $\phi$ is satisfiable. Then, by the
Ladner-like construction given above, we are in one of the following three cases:

1. $\ell_1 \wedge \cdots \wedge \ell_s$ is not satisfiable,

2. for some $j$, $\psi_1 \wedge \cdots \wedge \psi_k \wedge \xi_j$ is not satisfiable, or

3. $\psi_1 \wedge \cdots \wedge \psi_k$ is not satisfiable.

By induction, it follows immediately that we are in one of the following four
cases:

1. There exist $i, i'$ such that $\ell_i \wedge \ell_{i'}$ is not satisfiable,

2. there exist $i, i'$ such that $\psi_i \wedge \psi_{i'}$ is not satisfiable,

3. there exist $i, j$ such that $\psi_i \wedge \xi_j$ is not satisfiable, or

4. there exists a $j$ such that $\xi_j \wedge \xi_j$ is not satisfiable.

If we are in case 2, $\Box\psi_i \wedge \Box\psi_{i'}$ is not satisfiable. In case 3, $\Box\psi_i \wedge \Diamond\xi_j$ is not
satisfiable. In case 4, $\Diamond\xi_j \wedge \Diamond\xi_j$ is not satisfiable. So in each case we have found
a pair of conjuncts of $\phi$ that is not satisfiable, which contradicts the assumption.
□ 
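
The pairwise criterion can be turned into a memoized procedure along the following lines. This Python sketch is one possible reading of the dynamic programming idea, not the paper's own formulation. It assumes a poor man's formula is a tuple of conjuncts, each of the form `("lit", name, positive)`, `("box", body)` or `("dia", body)`, where `body` is again a tuple of conjuncts; the names `sat_ge1` and `pair_sat` are hypothetical.

```python
from functools import lru_cache
from itertools import combinations_with_replacement


def sat_ge1(phi):
    """phi is satisfiable on frames where every world has a successor iff
    every pair of (not necessarily different) conjuncts is (Theorem 3)."""
    return all(pair_sat(a, b)
               for a, b in combinations_with_replacement(tuple(phi), 2))


@lru_cache(maxsize=None)
def pair_sat(a, b):
    """Apply the three Ladner-style conditions to the two-conjunct formula
    a AND b; results are cached per pair of subformulas."""
    pair = (a, b)
    lits = [c for c in pair if c[0] == "lit"]
    boxes = [c[1] for c in pair if c[0] == "box"]
    dias = [c[1] for c in pair if c[0] == "dia"]
    # Condition 1: the literals must not be complementary.
    if len(lits) == 2 and lits[0][1] == lits[1][1] and lits[0][2] != lits[1][2]:
        return False
    box_part = tuple(c for body in boxes for c in body)
    if dias:
        # Condition 2: each diamond needs a successor that also obeys the boxes.
        return all(sat_ge1(box_part + xi) for xi in dias)
    # Condition 3: some successor must exist and obey the boxes.
    return sat_ge1(box_part)


p, not_p, q = ("lit", "p", True), ("lit", "p", False), ("lit", "q", True)
print(sat_ge1((("box", (p,)), ("box", (not_p,)), ("dia", (q,)))))  # False
print(sat_ge1((("dia", (p,)), ("dia", (not_p,)))))                 # True
```

Because the cache is keyed on pairs of subformulas of the input, only polynomially many distinct pairs are ever evaluated.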



Why doesn’t the same construction work for K? It is easy enough to come up
with a counterexample. For example, $\{\Box p, \Box\bar{p}, \Diamond q\}$ is not satisfiable, even though
every pair is satisfiable. The deeper reason is that we have some freedom in K
that we don’t have in $\mathcal{F}_{\ge 1}$. Namely, on a K frame a world can have successors
or no successors. This little bit of extra freedom is enough to encode coNP in
poor man’s language.

Theorem 2 showed that poor man’s satisfiability can be as hard as satisfiability
for NP-complete logics. In light of the fact that poor man’s satisfiability
for K is coNP-complete and poor man’s satisfiability with respect to $\mathcal{F}_{\ge 1}$ is even
in P, you might wonder if the complexity of PSPACE-complete logics always
decreases.

To try to keep the complexity as high as possible, it makes sense to look
at frames in which each world has a restricted number of successors, as in the
construction of Theorem 2. Because we want the logic to be PSPACE-complete,
we also need to make sure that the frames can simulate binary trees. The obvious
class of frames to look at is $\mathcal{F}_{\le 2}$, the class of frames in which each world has
at most two successors. This gives us the desired example.

Theorem 4. Satisfiability and poor man’s satisfiability with respect to $\mathcal{F}_{\le 2}$ are
PSPACE-complete.

Proof. Satisfiability with respect to $\mathcal{F}_{\le 2}$ is PSPACE-complete by pretty much
the same proof as the PSPACE-completeness proof for K [S|. To show that the
poor man’s version remains PSPACE-complete, first note that a formula is $\mathcal{F}_{\le 2}$
satisfiable if and only if it is satisfiable in the root of a binary tree. Stockmeyer ^
showed that the set of true quantified 3CNF formulas is PSPACE-complete.
Using padding, it is immediate that the following variation of this set is also
PSPACE-complete.

QUANTIFIED 3SAT: Given a quantified Boolean formula
$\exists p_1 \forall p_2 \exists p_3 \cdots \exists p_{n-1} \forall p_n \phi$, where $\phi$ is a propositional formula over
$p_1, \ldots, p_n$ in 3CNF (that is, a formula in conjunctive normal form with
exactly 3 literals per clause), is the formula true?

We will reduce QUANTIFIED 3SAT to poor man’s satisfiability with respect
to binary trees. To simulate the quantifiers, we need to go back to the formula
that forces models to be of exponential size.

$$\phi_{asg} = \bigwedge_{i=1}^{n} \Box^{i-1}\left(\Diamond\Box^{n-i} p_i \wedge \Diamond\Box^{n-i} \bar{p}_i\right).$$

$\phi_{asg}$ is clearly satisfiable in the root of a binary tree and if $\phi_{asg}$ is satisfied in
the root of a binary tree, the worlds of depth $\le n$ form a complete binary tree
of depth $n$ and every assignment to $p_1, \ldots, p_n$ occurs exactly once in a world at
depth $n$. We will call the worlds at depth $n$ the assignment-worlds.

The assignment-worlds in a subtree rooted at a world at distance $i \le n$
from the root are constant with respect to the value of $p_i$. It follows that
$\exists p_1 \forall p_2 \exists p_3 \cdots \exists p_{n-1} \forall p_n \phi \in$ QUANTIFIED 3SAT if and only if $\phi_{asg} \wedge (\Diamond\Box)^{n/2}\phi$
is satisfiable with respect to binary trees.

This proves that satisfiability for $\mathcal{F}_{\le 2}$ is PSPACE-hard, but it does not prove
that the poor man’s version is PSPACE-hard. Recall that $\phi$ is in 3CNF and thus
not a poor man’s formula.

Below, we will show how to label all assignment-worlds where $\phi$ does not
hold by $f$ (for false). It then suffices to add the conjunct $(\Diamond\Box)^{n/2}\bar{f}$ to obtain a
reduction.

How can we label all assignment-worlds where $\phi$ does not hold by $f$? Let $k$
be such that $\phi = \psi_1 \wedge \psi_2 \wedge \cdots \wedge \psi_k$, where each $\psi_i$ is the disjunction of exactly 3
literals: $\psi_i = \ell_{i1} \vee \ell_{i2} \vee \ell_{i3}$. We assume without loss of generality that $n$ is even
and that each $\psi_i$ contains 3 different propositional variables.

For every $i$, we will label all assignment-worlds where $\psi_i$ does not hold by $f$.
Since $\psi_i = \ell_{i1} \vee \ell_{i2} \vee \ell_{i3}$, this implies that we have to label all assignment-worlds
where $\bar{\ell}_{i1} \wedge \bar{\ell}_{i2} \wedge \bar{\ell}_{i3}$ holds by $f$. In general, this cannot be done in poor man’s
logic, but in this special case we are able to do it, because the relevant part of
the model is completely fixed by $\phi_{asg}$.

As a warm-up, first consider how you would label all assignment-worlds where
$p_3$ holds by $f$. This is easy; add the conjunct

$$\Box\Box\Diamond\Box^{n-3}(p_3 \wedge f).$$

You can label all assignment-worlds where $p_3 \wedge p_5$ holds as follows:

$$\Box\Box\Diamond\Box\Diamond\Box^{n-5}(p_3 \wedge p_5 \wedge f).$$

This can easily be generalized to a labeling for $p_3 \wedge p_5 \wedge p_8$:

$$\Box\Box\Diamond\Box\Diamond\Box\Box\Diamond\Box^{n-8}(p_3 \wedge p_5 \wedge p_8 \wedge f).$$

Note that we can write the previous formula in the following suggestive way:

$$\Box^{3-1}\Diamond\Box^{5-3-1}\Diamond\Box^{8-5-1}\Diamond\Box^{n-8}(p_3 \wedge p_5 \wedge p_8 \wedge f).$$

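In general, suppose you want to label all assignment-worlds where $\ell_1 \wedge \ell_2 \wedge \ell_3$
holds by $f$, where $\ell_1$, $\ell_2$, and $\ell_3$ are literals. Suppose that $\ell_1$, $\ell_2$, and $\ell_3$’s
propositional variables are $p_a$, $p_b$, and $p_c$, respectively. Also suppose that $a <
b < c$. The labeling formula label-false$(\ell_1 \wedge \ell_2 \wedge \ell_3)$ is defined as follows.

$$\text{label-false}(\ell_1 \wedge \ell_2 \wedge \ell_3) = \Box^{a-1}\Diamond\Box^{b-a-1}\Diamond\Box^{c-b-1}\Diamond\Box^{n-c}(\ell_1 \wedge \ell_2 \wedge \ell_3 \wedge f).$$

If label-false$(\ell_1 \wedge \ell_2 \wedge \ell_3)$ is satisfied in the root of a complete binary tree,
then there exist at least $2^{n-3}$ worlds at depth $n$ such that $(\ell_1 \wedge \ell_2 \wedge \ell_3 \wedge f)$ holds.

If $\phi_{asg}$ is satisfied in the root of a binary tree, then the worlds of depth $\le n$
form a complete binary tree and there are exactly $2^{n-3}$ assignment-worlds such
that $(\ell_1 \wedge \ell_2 \wedge \ell_3)$ holds.

It follows that if $\phi_{asg}$ is satisfied in the root of a binary tree, then
label-false$(\ell_1 \wedge \ell_2 \wedge \ell_3)$ is satisfied in the root if and only if $f$ holds in every
assignment-world where $(\ell_1 \wedge \ell_2 \wedge \ell_3)$ holds.

Thus, the following function $g$ is a reduction from QUANTIFIED 3SAT to
poor man’s satisfiability with respect to $\mathcal{F}_{\le 2}$.

$$g(\exists p_1 \forall p_2 \exists p_3 \cdots \exists p_{n-1} \forall p_n \phi) = \phi_{asg} \wedge \bigwedge_{i=1}^{k} \text{label-false}(\bar{\ell}_{i1} \wedge \bar{\ell}_{i2} \wedge \bar{\ell}_{i3}) \wedge (\Diamond\Box)^{n/2}\bar{f}.$$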

□ 
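
The labeling formula is easy to generate mechanically. The following Python sketch prints it for three literals over distinct variables; the ASCII notation (`[]` for $\Box$, `<>` for $\Diamond$, `~p` for $\bar{p}$) and the function name `label_false` are assumptions of this sketch.

```python
def label_false(lits, n):
    """lits: three (index, positive) pairs over distinct variables p1..pn.
    Returns Box^{a-1} Dia Box^{b-a-1} Dia Box^{c-b-1} Dia Box^{n-c}(l1 & l2 & l3 & f)."""
    ordered = sorted(lits)                    # sort by variable index a < b < c
    (a, _), (b, _), (c, _) = ordered
    prefix = ("[]" * (a - 1) + "<>" +
              "[]" * (b - a - 1) + "<>" +
              "[]" * (c - b - 1) + "<>" +
              "[]" * (n - c))
    body = " & ".join(("" if positive else "~") + f"p{i}"
                      for i, positive in ordered)
    return f"{prefix}({body} & f)"


print(label_false([(3, True), (5, True), (8, True)], 10))
# [][]<>[]<>[][]<>[][](p3 & p5 & p8 & f)
```

The prefix always has exactly $n$ modal operators, so the labeled worlds are assignment-worlds, and the three $\Diamond$ operators sit at the depths where the variables $p_a$, $p_b$, and $p_c$ are decided.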

Why doesn’t the construction of Theorem 4 work for K? A formula that is
satisfiable in a world with exactly two successors is also satisfiable in a world
with more than two successors. Because of this, the label-false formula will not
necessarily label all assignment-worlds where $\phi$ does not hold by $f$. For a very
simple example, consider the formula

$$\Diamond p \wedge \Diamond\bar{p} \wedge \Diamond(p \wedge f) \wedge \Diamond(\bar{p} \wedge f) \wedge \Diamond\bar{f}.$$

This formula is not $\mathcal{F}_{\le 2}$ satisfiable, since both the $p$ successor and the $\bar{p}$ successor
are labeled $f$. However, this formula is satisfiable in a world with three successors,
satisfying $p \wedge f$, $\bar{p} \wedge f$, and $\bar{f}$, respectively.



5 $\mathcal{ALEN}$ Satisfiability Is PSPACE-Complete

In the introduction, we mentioned that poor man’s logic is closely related to certain
description logics. Donini et al. [2] almost completely characterize the complexity
of the most common description logics. The only language they couldn’t
completely characterize is $\mathcal{ALEN}$. $\mathcal{ALEN}$ is $\mathcal{ALE}$ (the poor man’s version of
multi-modal K) with number restrictions. Number restrictions are of the form
$(\le n)$ and $(\ge n)$. $(\le n)$ is true if and only if a world has $\le n$ successors and $(\ge n)$
is true if and only if a world has $\ge n$ successors.

In [2], it was shown that $\mathcal{ALEN}$ satisfiability is in PSPACE, assuming that
the number restrictions are given in unary. The best lower bound for satisfiability
was the coNP lower bound that is immediate from the fact that this is an
extension of $\mathcal{ALE}$.

We will use Theorem 4 to prove PSPACE-hardness for a very restricted
version of $\mathcal{ALEN}$.



Theorem 5. Satisfiability for the poor man’s version of K extended with the
number restriction $(\le 2)$ is PSPACE-hard.

Proof. The reduction from poor man’s satisfiability with respect to $\mathcal{F}_{\le 2}$ is obvious.
It suffices to use the number restriction $(\le 2)$ to make sure that every world
in the relevant part of the model has at most two successors. Let $md(\phi)$ be the
modal depth of $\phi$. All worlds that are of importance to the satisfiability of $\phi$ are
at most $md(\phi)$ steps away from the root. The reduction is as follows:

$$f(\phi) = \phi \wedge \bigwedge_{i=0}^{md(\phi)} \Box^i (\le 2)$$



□ 

Combining this with the PSPACE upper bound from [2] completely characterizes
the complexity of $\mathcal{ALEN}$ satisfiability.

Corollary 1. $\mathcal{ALEN}$ satisfiability is PSPACE-complete.



6 Other Restrictions on the Set of Operators 

As mentioned in the introduction, restricting the modal language in the way that
we have, i.e., looking at formulas built from literals, $\wedge$, $\Box$, and $\Diamond$, was motivated
by the fact that this restriction occurs in description logics and also by the rather
bizarre complexity behavior of this fragment.

From a more technical point of view, however, we might well wonder what
happens to other restrictions on the set of operators allowed. After all, who is
to say which sublanguages will be useful in the future? Also, we might hope to
gain more insight into the sources of complexity for modal logics by looking at
different sublanguages.

For $S \subseteq \{\bar{\ }, \neg, \wedge, \vee, \Box, \Diamond, \mathit{true}, \mathit{false}\}$, let $\mathcal{L}(S)$ denote the modal language
whose formulas are built from an infinite set of propositional variables and operators
from $S$. We will write $\bar{\ }$ for propositional negation, and $\neg$ for general
negation. So, our “old” language $\mathcal{L}$ will be denoted by $\mathcal{L}(\{\bar{\ }, \wedge, \vee, \Box, \Diamond\})$, and
poor man’s language by $\mathcal{L}(\{\bar{\ }, \wedge, \Box, \Diamond\})$.

Completely characterizing the complexity of $\mathcal{L}(S)$ satisfiability (with respect
to the class of all frames) for every $S \subseteq \{\bar{\ }, \neg, \wedge, \vee, \Box, \Diamond, \mathit{true}, \mathit{false}\}$ may seem to
be a daunting task, since there are $2^8$ subsets to consider. But it turns out that
there are only four possibilities for the complexity of these satisfiability problems:
P, NP-complete, coNP-complete, and PSPACE-complete. Also, there are
not many surprises: Languages that contain a complete basis for modal logic
obviously have PSPACE-complete satisfiability problems, languages that contain a
complete basis for propositional logic, but not for modal logic, have NP-complete
satisfiability problems, and poor man’s logic (with or without constants) is the
only coNP-complete case. All other cases are in P, except for the one surprise
that $\mathcal{L}(\{\wedge, \vee, \Box, \Diamond, \mathit{false}\})$ satisfiability is PSPACE-complete.

Due to space limitations, we refer the reader to the full version of this paper 
for the proofs |^. 
Abstract. We present a fast deterministic algorithm for integer sorting
in linear space. Our algorithm sorts $n$ integers in linear space in
$O(n(\log\log n)^{1.5})$ time. This improves the $O(n(\log\log n)^2)$ time bound
given in [11]. This result is obtained by combining our new technique with
that of Thorup [11]. The approach and technique we provide are totally
different from previous approaches and techniques for the problem. As a
consequence our technique can be extended to apply to nonconservative
sorting and parallel sorting. Our nonconservative sorting algorithm sorts
$n$ integers in $\{0, 1, \ldots, m-1\}$ in time $O(n(\log\log n)^2/(\log k + \log\log\log n))$
using word length $k\log(m+n)$, where $k \le \log n$. Our EREW parallel
algorithm sorts $n$ integers in $\{0, 1, \ldots, m-1\}$ in $O((\log n)^2)$ time and
$O(n(\log\log n)^2/\log\log\log n)$ operations provided $\log m = \Omega((\log n)^2)$.



1 Introduction 

Sorting is a classical problem which has been studied by many researchers.
Although the complexity for comparison sorting is now well understood, the
picture for integer sorting is still not clear. The only known lower bound for
integer sorting is the trivial $\Omega(n)$ bound. Recent advances in the design of
algorithms for integer sorting have resulted in fast algorithms 0 jS]. However,
these algorithms use randomization or superlinear space. For sorting integers in
$\{0, 1, \ldots, m-1\}$, $O(nm^{\epsilon})$ space is used in the algorithms reported in P] jOj. When $m$
is large (say $m = \Omega(2^n)$) the space used is excessive. Integer sorting using linear
space is therefore extensively studied by researchers. An earlier work by Fredman
and Willard PI shows that $n$ integers can be sorted in $O(n\log n/\log\log n)$ time in
linear space. Raman showed that sorting can be done in $O(n\sqrt{\log n\log\log n})$ time
in linear space pi)|. Later Andersson improved the time bound to $O(n\sqrt{\log n})$ [2].
Then Thorup improved the time bound to $O(n(\log\log n)^2)$ [11]. In this paper
we further improve upon previous results. We show that $n$ integers can be sorted
in $O(n(\log\log n)^{1.5})$ time in linear space.

Unlike previous techniques, our technique can be extended to
apply to nonconservative sorting and parallel sorting. Conservative sorting is to
sort $n$ integers in $\{0, 1, \ldots, m-1\}$ with word length (the number of bits in a word)
$O(\log(m+n))$ |S1-. Nonconservative sorting is to sort with word length larger
than $O(\log(m+n))$. We show that $n$ integers in $\{0, 1, \ldots, m-1\}$ can be sorted
in time $O(n(\log\log n)^2/(\log k + \log\log\log n))$ with word length $k\log(m+n)$
where $k \le \log n$. Thus if $k = (\log n)^{\epsilon}$, $0 < \epsilon < 1$, the sorting can be done in
linear space and $O(n\log\log n)$ time. Andersson [2] and Thorup [11] did not show
how to extend their linear space sorting algorithms to nonconservative sorting.
Thorup [11] used an algorithm to insert a batch of $n$ integers into the search tree
in $O(n\log\log n)$ time. When using word length $k\log(m+n)$ this time complexity
can be reduced to $O(n(\log\log n - \log k))$, thus yielding an $O(n\log\log n(\log\log n - \log k))$
time algorithm for linear space sorting, which is considerably worse than
our algorithm.

Also note that previous results do not readily extend to parallel
sorting. Our technique can be applied to obtain a more efficient parallel algorithm
for integer sorting. In this regard the best previous result on the EREW PRAM is
due to Han and ShenpJ which sorts $n$ integers in $\{0, 1, \ldots, m-1\}$ in $O(\log n)$ time
and $O(n\sqrt{\log n})$ operations (time processor product). We show that when $\log m =
\Omega((\log n)^2)$ we can sort in $O((\log n)^2)$ time with $O(n(\log\log n)^2/\log\log\log n)$
operations on the EREW PRAM. Thus for large integers our new algorithm is
more efficient than the best previous algorithm.



2 Preparation 

Word length is the number of bits in a word. For sorting $n$ integers in the
range $\{0, 1, 2, \ldots, m-1\}$ we assume that the word length used in our conservative
algorithm is $O(\log(m+n))$. The same assumption is made in previous
designs 0 g| ^ ^ . In integer sorting we often pack several small integers into
one word. We always assume that all the integers packed in a word use the same
number of bits. Suppose $k$ integers each having $l$ bits are packed into one word.
By using the test bit technique we can do a pairwise comparison of the corresponding
integers in two words and extract the larger integers into one word
and smaller integers into another word in constant time. Therefore by adapting
well-known selection algorithms (e.g. select median for every 5 elements, find the
median $a$ among the selected elements, use $a$ to eliminate 1/4 of the elements
and then recurse), we immediately have the following lemma:

Lemma 1: Selecting the $s$-th largest integer among the $n$ integers packed into
$n/k$ words can be done in $O(n\log k/k)$ time and $O(n/k)$ space. In particular the
median can be found in $O(n\log k/k)$ time and $O(n/k)$ space.

The factor $\log k$ in Lemma 1 comes from the fact that after a constant fraction
of the integers are eliminated we have to pack the remaining integers into fewer
words. This packing incurs the factor $\log k$ in the time complexity.
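
The test bit technique mentioned above can be sketched with ordinary integer arithmetic. The Python code below is a minimal illustration under the assumption that each packed field carries one extra, initially zero, test bit above its $l$ data bits; the helper names `pack`, `unpack` and `compare_exchange` are hypothetical. Python integers are unbounded, so this shows the bit manipulation only, not the constant-time cost on a word RAM.

```python
def pack(values, l):
    """Pack values (each < 2**l) into one word, field i at bit offset i*(l+1)."""
    word = 0
    for i, v in enumerate(values):
        assert 0 <= v < (1 << l)
        word |= v << (i * (l + 1))
    return word


def unpack(word, k, l):
    """Recover the k packed l-bit values."""
    return [(word >> (i * (l + 1))) & ((1 << l) - 1) for i in range(k)]


def compare_exchange(x, y, k, l):
    """Return (larger, smaller): the field-wise maximum and minimum of the
    two packed words, using a constant number of word operations."""
    test = sum(1 << (i * (l + 1) + l) for i in range(k))  # test bit of every field
    # After the subtraction, the test bit of field i survives iff x_i >= y_i;
    # no borrow crosses a field boundary because x_i + 2**l - y_i >= 1.
    ge = ((x | test) - y) & test
    mask = ge - (ge >> l)            # all data bits set in fields where x_i >= y_i
    data = test - (test >> l)        # all data bits of every field
    larger = (x & mask) | (y & (data ^ mask))
    smaller = (y & mask) | (x & (data ^ mask))
    return larger, smaller


x = pack([3, 7, 1, 5], 3)
y = pack([4, 2, 1, 6], 3)
hi, lo = compare_exchange(x, y, 4, 3)
print(unpack(hi, 4, 3), unpack(lo, 4, 3))  # [4, 7, 1, 6] [3, 2, 1, 5]
```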

Now consider sorting small integers. Let $k$ integers be packed in one word.
We say that the $nk$ integers in $n$ words are sorted if the $ki$-th to $(k(i+1)-1)$-th
smallest integers are sorted and packed in the $i$-th word, $0 \le i < n$. We have the
following lemma:

Lemma 2: If $k = 2^t$ integers using a total of $(\log n)/2$ bits are packed into one
word, then the $nk$ integers in $n$ words can be sorted in $O(nt) = O(n\log k)$ time
and $O(n)$ space.

Proof: Because only $(\log n)/2$ bits are used in each word to store $k$ integers we
can use bucket sorting to sort all words by treating each word as one integer
and this takes $O(n)$ time and space. Because only $(\log n)/2$ bits are used in each
word there are only $\sqrt{n}$ patterns for all the words. We then put $k < (\log n)/2$
words with the same pattern into one group. For each pattern there are at most
$k-1$ words left which cannot form a group. Therefore at most $\sqrt{n} \cdot (k-1)$ words
cannot form groups. For each group we use the following algorithm to move the
$i$-th integer in all $k$ words into one word.
Algorithm Transpose(s, A_0, A_1, ..., A_{2g-1})

/* s = (g log n)/(2k). Let A_i be the i-th word, 0 <= i < 2g. Let B_i and C_i,
0 <= i < k, be words used for temporary storage. Let D_s be the constant which
is (0^s 1^s)^{(log n)/(4s)} when represented as a binary string. Let E_s be the
constant which is the 1's complement of D_s, that is (1^s 0^s)^{(log n)/(4s)} when
represented as a binary string. AND and OR are bit-wise AND and OR operations. */
for i = 0 to g - 1 do
begin
    B_i = A_i AND D_s;
    B_{i+g} = A_{i+g} AND D_s;
    B_i = B_i * 2^s;
    B_i = B_i OR B_{i+g};
    C_i = A_i AND E_s;
    C_{i+g} = A_{i+g} AND E_s;
    C_i = C_i / 2^s;
    C_i = C_i OR C_{i+g};
    A_i = B_i;
    A_{i+g} = C_i;
end
if g = 1 return;
Call Transpose(s/2, A_0, A_1, ..., A_{g-1});
Call Transpose(s/2, A_g, A_{g+1}, ..., A_{2g-1});

We invoke Transpose($(\log n)/4$, $A_0, A_1, \ldots, A_{k-1}$) to transpose the integers in
a group. This takes $O(k\log k)$ time and $O(k)$ space for each group. Therefore for
all groups it takes $O(n\log k)$ time and $O(n)$ space. For the words not in a group
(there are at most $\sqrt{n} \cdot (k-1)$ of them) we simply disassemble the words and
then reassemble the words. This will take no more than $O(n)$ time and space.
After all these are done we then use bucket sorting again to sort the $n$ words.
This will have all the integers sorted. □
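
The following Python sketch mirrors the recursive structure of Transpose on $k$ words that each hold $k$ fields of $b$ bits. It is an illustration only: it assumes $k$ is a power of two, that the mask playing the role of $D_s$ selects the low $s$ bits of every block of $2s$ bits, and it works in place on a list rather than on named temporaries. The names `block_mask` and `transpose` are hypothetical.

```python
def block_mask(s, total_bits):
    """Mask with the low s bits set in every block of 2*s bits (role of D_s)."""
    mask = 0
    for start in range(0, total_bits, 2 * s):
        mask |= ((1 << s) - 1) << start
    return mask


def transpose(words, lo, count, s, total_bits):
    """Transpose words[lo:lo+count] in place; s is the current block size in bits.
    Afterwards every word holds one field position gathered from all input words."""
    if count < 2:
        return
    g = count // 2
    d = block_mask(s, total_bits)
    e = d << s                       # complementary blocks (role of E_s)
    for i in range(lo, lo + g):
        a, b = words[i], words[i + g]
        words[i] = ((a & d) << s) | (b & d)        # low blocks of both words
        words[i + g] = ((a & e) >> s) | (b & e)    # high blocks of both words
    if g == 1:
        return
    transpose(words, lo, g, s // 2, total_bits)
    transpose(words, lo + g, g, s // 2, total_bits)


k, b = 4, 4
# Field c of word r stores the tag r*k + c, so rows and columns can be read back.
words = [sum((r * k + c) << (c * b) for c in range(k)) for r in range(k)]
transpose(words, 0, k, (k // 2) * b, k * b)
for w in words:
    print([(w >> (c * b)) & ((1 << b) - 1) for c in range(k)])
```

Each level of the recursion touches every word once with a constant number of operations, which matches the $O(k\log k)$ time per group stated in the proof. The order of the fields inside an output word is not the row order, which is harmless here because the algorithm only needs all fields of one position to end up in the same word.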

Note that when $k = O(\log n)$ we are sorting $O(n\log n)$ integers packed in $n$
words in $O(n\log\log n)$ time and $O(n)$ space. Therefore the saving is considerable.

Lemma 3: Assume that each word has $\log m > \log n$ bits, that $k$ integers each
having $(\log m)/k$ bits are packed into one word, that each integer has a label
containing $(\log n)/(2k)$ bits, and that the $k$ labels are packed into one word the
same way as integers are packed into words (that is, if integer $a$ is packed as the
$s$-th integer in the $t$-th word then the label for $a$ is packed as the $s$-th label in
the $t$-th word for labels), then $n$ integers in $n/k$ words can be sorted by their
labels in $O((n\log\log n)/k)$ time and $O(n/k)$ space.

Proof: The words for labels can be sorted by bucket sorting because each word
uses $(\log n)/2$ bits. The sorting will group words for integers into groups as in
Lemma 2. We can then call Transpose on each group of words for integers. □

Note also that the sorting algorithms given in Lemma 2 and Lemma 3 are not
stable. As will be seen later we will use these algorithms to sort arbitrarily large
integers. Even though we do not know how to make the algorithm in Lemma 2
stable, as will be seen, our sorting algorithm for sorting large integers can
be made stable by using the well-known method of appending the address bits
to each input integer.

If we have larger word length the sorting can be done faster as shown in the 
following lemma. 

Lemma 4: Assume that each word has log m log log n > log n bits, that k
integers each having (log m)/k bits are packed into one word, that each integer
has a label containing (log n)/(2k) bits, and that the k labels are packed into
one word the same way as integers are packed into words. Then n integers in n/k
words can be sorted by their labels in O(n/k) time and O(n/k) space.

Proof: Note that although the word length is log m log log n, only log m bits
are used for storing packed integers. As in Lemmas 2 and 3 we sort the words
containing packed labels by bucket sorting. Instead of putting k words into one
group we put k log log n words into one group. To transpose the integers in a
group containing k log log n words we first further pack the k log log n words
into k words by packing log log n words into one word. We then do transpose on
the k words. Thus transpose takes only O(k log log n) time for each group and
O(n/k) time for all integers. After finishing transpose we then unpack the
integers in the k words into k log log n words. □

Note also that if the word length is log m log log n and only log m bits are
used to pack k ≤ log n integers into one word, then the selection in Lemma 1 can
be done in O(n/k) time and space.



3 The Approach and the Technique 



Consider the problem of sorting n integers in {0, 1, ..., m - 1}. We assume that
each word has log m bits and that log m ≥ log n log log n. Otherwise we can
use radix sorting to sort in O(n log log n) time and linear space. We divide the
log m bits used for representing each integer into log n blocks. Each block thus
contains at least log log n bits. The i-th block contains the (i log m/log n)-th
to the ((i + 1) log m/log n - 1)-th bits. Bits are counted from the least
significant bit starting at 0. We sort from high order bits to low order bits. We
now propose a 2 log n stage algorithm which works as follows.

In each stage we work on one block of bits. We call these blocks small integers
because each small integer now contains only log m/log n bits. Each integer is
represented by and corresponds to a small integer which we are working on.
Consider the 0-th stage which works on the most significant block (the
(log n - 1)-th block). Assume that the bits in these small integers are packed
into n/log n words with log n small integers packed into one word. For the moment
we ignore the time needed for packing these small integers into n/log n words and
assume that this is done for free. By Lemma 1 we can find the median of these n
small integers in O(n log log n/log n) time and O(n/log n) space. Let a be the
median found. Then the n small integers can be divided into at most three sets
S_1, S_2, and S_3. S_1 contains the small integers which are less than a. S_2
contains the small integers which are equal to a. S_3 contains the small integers
which are greater than a. We also have |S_1| ≤ n/2 and |S_3| ≤ n/2. Although
|S_2| could be larger than n/2, all small integers in S_2 are equal. Let S'_2 be
the set of integers whose most significant block is in S_2. Then we can eliminate
log m/log n bits (the most significant block) from each integer in S'_2 from
further consideration. Thus after one stage each integer is either in a set whose
size is at most half of the size of the set at the beginning of the stage, or one
block of bits (log m/log n bits) of the integer can be eliminated from further
computation. Because there are only log n blocks in each integer, each integer
takes at most log n stages to eliminate blocks of bits. An integer can be put in
a half sized set at most log n times. Therefore after 2 log n stages all integers
are sorted. Because in each stage we are dealing with only n/log n words, if we
ignore the time needed for packing small integers into words and for moving small
integers to the right set, then the remaining time and space complexity will be
O(n log log n) because there are only 2 log n stages.
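
The stage structure described above can be sketched without any word packing. The following is only an illustration of the dividing logic, not of the packed algorithm: the median is found by sorting the block values (a stand-in for Lemma 1), the integers are moved explicitly rather than through Lemma 3, and the names block and stage_sort are hypothetical.

```python
def block(x, idx, bits):
    """Return the idx-th block of x, counting blocks of `bits` bits from the least significant end."""
    return (x >> (idx * bits)) & ((1 << bits) - 1)


def stage_sort(values, num_blocks, bits):
    """Sort non-negative integers of num_blocks * bits bits by repeated median splits on blocks."""
    out = []

    def divide(s, cur):
        # All integers in s agree on every block more significant than cur.
        if len(s) <= 1 or cur < 0:
            out.extend(s)  # cur < 0 means every block was eliminated: all equal
            return
        keys = sorted(block(x, cur, bits) for x in s)
        med = keys[len(keys) // 2]  # stand-in for the selection of Lemma 1
        s1 = [x for x in s if block(x, cur, bits) < med]
        s2 = [x for x in s if block(x, cur, bits) == med]
        s3 = [x for x in s if block(x, cur, bits) > med]
        divide(s1, cur)      # at most half the size, same current block
        divide(s2, cur - 1)  # current block eliminated
        divide(s3, cur)      # at most half the size, same current block

    divide(list(values), num_blocks - 1)
    return out
```

Every recursive call either at least halves the set or eliminates one block, which is the counting argument behind the 2 log n stages.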

The subtle part of the algorithm is how to move small integers into the set
where the corresponding integer belongs after previous set dividing operations
of our algorithm. Suppose that n integers have already been divided into k sets.
Also assume that (log n)/(2 log k) small integers each containing log k continuous
blocks of an integer are packed into one word. For each small integer we
use a label of log k bits indicating which set it belongs to. Assume that the
labels are also packed into words the same way as the small integers are packed
into words, with (log n)/(2 log k) labels packed into one word. Thus if small
integer a is packed as the s-th small integer in the t-th word then the label for
a is packed as the s-th label in the t-th word for labels. Note that we cannot
disassemble the small integers from the words and then move them because this
would incur O(n) time. Because each word for labels contains (log n)/(2 log k)
labels, only (log n)/2 bits are used for each such word. Thus Lemma 3 can
be applied here to move the small integers into the sets they belong to. Because
only O((n log k)/log n) words are used, the time complexity for moving small
integers to their sets is O((n log log n log k)/log n).

Note that O(log k) blocks for each small integer is the largest number of bits
we can move in applying Lemma 3 because each word has log m bits. Note also
that the moving process is not stable as the sorting algorithm in Lemma 3 is not
stable.

With such a moving scheme we immediately face the following problem. Suppose
integer a is the k-th member of a set S, that is, a block of a (call it a') is
listed as the k-th (small) integer in S. When we use the above scheme to move the
next several blocks of a (call them a'') into S, a'' is merely moved into a
position in set S, but not necessarily to the k-th position (the position where
a' is located). If the value of the block for a' is identical for all integers in
S, this does not create a problem because that block is identical no matter which
position in S a'' is moved to. If the value of the block for a' is not identical
for all integers in S then we have a problem continuing the sorting process. What
we do is the following. At each stage the integers in one set work on a common
block which is called the current block of the set. The blocks which precede the
current block contain more significant bits of the integer and are identical for
all integers in the set. When we are moving more bits into the set we move the
following blocks together with the current block into the set. That is, in the
above moving process we assume the most significant block among the log k
continuous blocks is the current block. Thus after we move these log k blocks
into the set we delete the original current block, because we know that the log k
blocks are moved into the correct set and that where the original current block
is located is not important because that current block is contained in the log k
blocks.

Another problem we have to pay attention to is that the size of the sets after
several stages of dividing will become small. The scheme of Lemmas 2 and 3
relies on the fact that the size of the set is not very small. We cope with this
problem in this way. If the size of the set is larger than sqrt(n) we keep
dividing the set. In this case each word for packing the labels can use at least
(log n)/4 bits. When the size of the set is no larger than sqrt(n) we then use a
recursion to sort the set. In each next level of recursion each word for packing
the labels uses fewer bits. The recursion has O(log log n) levels.

Below is our sorting algorithm which is used to sort integers into sets of size
no larger than sqrt(n). This algorithm uses yet another recursion (do not confuse
this recursion with the recursion mentioned in the above paragraph).
Algorithm Sort(level, a_0, a_1, ..., a_t)

/* a_i's are the input integers in a set to be sorted. level is the recursion level. */

1. if level = 1 then examine the size of the set (i.e. t). If the size of the set
is less than or equal to sqrt(n) then return. Otherwise use the current block to
divide the set into at most three sets by using Lemma 1 to find the median and
then using Lemma 3 to sort. For the set all of whose elements are equal to the
median, eliminate the current block and note the next block to become the current
block. Create a label which is the set number (0, 1 or 2 because the set is
divided into at most three sets) for each integer. Then reverse the computation
to route the label for each integer back to the position where the integer was
located in the input to the procedure call. Also route a number (a 2 bit number)
for each integer indicating the current block back to the location of the
integer. This is possible because we can assume each block has at least
log log n bits. Return.

2. Cut the bits in each integer a_i into two equal segments a_i^High (high order
bits) and a_i^Low (low order bits). Pack the a_i^High's into half the number of
words. Call Sort(level - 1, a_0^High, a_1^High, ..., a_t^High); /* When the
algorithm returns from this recursive call the label for each integer indicating
the set the integer belongs to is already routed back to the position where the
integer is located in the input of the procedure call. */

3. For each integer a_i extract out a_i^Low which has half the number of bits as
in a_i and is a continuous segment with the most significant block being the
current block of a_i. Pack the a_i^Low's into half the number of words as in the
input. Route the a_i^Low's to their sets by using Lemma 3.

4. For each set S = {a_{i_0}, a_{i_1}, ..., a_{i_t}} call
Sort(level - 1, a_{i_0}^Low, a_{i_1}^Low, ..., a_{i_t}^Low).

5. Route the label which is the set number for each integer back to the position
where the integer was located in the input to the procedure call. Also route a
number (a 2(level + 1) bit number) for each integer indicating the current block
back to the location of the integer. This step is the reverse of the routing in
Step 3.

In Step 3 of algorithm Sort we need to extract the a_i^Low's and to pack them.
The extraction requires a mask. This mask can be computed in O(log log n) time
for each word. Suppose k small integers each containing (log n)/(4k) blocks are
packed in a word. We start with a constant which is … when represented as a
binary string, where t is the number of bits in a block. Because a 2(level + 1)
bit number a is used to note the current block, we can check 1 bit of a in a step
for all a's packed in a word (there are k of them). This determines whether we
need to shift the … for each small integer to the left or not. Thus using
O(log log n) time we can produce the mask for each word. Suppose the current
block is the ((log n)/(8k) + 5)-th block; then the resulting mask corresponding
to this small integer will be …

Packing is to pack s ≤ log n blocks into consecutive locations in a word. This
can be done in O(log log n) time for each word by using the packing algorithm in
[…] (Section 3.4.3).
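
The bit-by-bit construction of the mask can be sketched for a single small integer as follows. This is an illustration under assumptions, not the word-parallel procedure of the text: the starting constant is taken to be a low-order run of ones covering the segment, the block offset is examined one bit at a time, and the mask is shifted by the matching power of two whenever that bit is set. The function names and parameters are hypothetical, and current_block is assumed to be at least width_blocks - 1.

```python
def shifted_mask(offset_blocks, width_blocks, t, offset_bits):
    """Build a mask of width_blocks blocks (t bits each) shifted left by offset_blocks blocks.

    The shift is applied one bit of offset_blocks at a time, so the number of
    steps is the number of bits in the offset, not the offset itself.
    """
    mask = (1 << (width_blocks * t)) - 1  # assumed starting constant: low-order ones
    for bit in range(offset_bits):
        if (offset_blocks >> bit) & 1:
            mask <<= (1 << bit) * t
    return mask


def extract_segment(value, current_block, width_blocks, t):
    """Extract width_blocks continuous blocks whose most significant block is current_block."""
    offset = current_block - width_blocks + 1
    offset_bits = max(1, offset.bit_length())
    mask = shifted_mask(offset, width_blocks, t, offset_bits)
    return (value & mask) >> (offset * t)
```

In the packed setting the same conditional shifts would be applied to all k fields of a word at once, which is where the O(log log n) bound per word comes from.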

We let a block contain (4 log m)/log n bits. Then if we call
Sort(log((log n)/4), a_0, a_1, ..., a_{n-1}) where the a_i's are the input
integers, then (log n)/4 calls to the level 1 procedure will be executed. This
could split the input set into 3^((log n)/4) sets, and therefore we need
log 3^((log n)/4) bits to represent/index each set. When the procedure returns
the number of eliminated bits in different sets could be different. Therefore we
need to modify our procedure a little bit. At level j we form a_i^High by
extracting out the 2^(j-1) continuous blocks with the most significant block
being the current block from a_i. After this modification we call Sort six times
as below:

Algorithm IterateSort

Call Sort(log((log n)/4), a_0, a_1, ..., a_{n-1});

for j = 1 to 5 do
begin
    Move a_i to its set by bucket sorting because there are only about sqrt(n) sets;
    For each set S = {a_{i_0}, a_{i_1}, ..., a_{i_t}} if t > sqrt(n) then
        call Sort(log((log n)/4), a_{i_0}, a_{i_1}, ..., a_{i_t});
end

Then (3/2) log n calls to the level 1 procedure are executed. Blocks can be
eliminated at most log n times. The other (1/2) log n calls are sufficient to
partition the input set of size n into sets of size no larger than sqrt(n).

At level j we use only n/2^(log((log n)/4) - j) words to store small integers.
Each call to the Sort procedure involves a sorting on labels and a transposition
of packed integers (use Lemma 3) and therefore involves a factor of log log n in
time complexity. Thus the time complexity of algorithm Sort is:

T(level) = 2T(level - 1) + cn log log n/2^(log((log n)/4) - level);    (1)

T(0) = 0.

where c is a constant. Thus T(log((log n)/4)) = O(n (log log n)^2). Algorithm
IterateSort only sorts sets into sizes less than sqrt(n). We need another
recursion to sort sets of size less than sqrt(n). This recursion has O(log log n)
levels. Thus the time complexity to have the input integers sorted is
O(n (log log n)^3).
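
The solution of the recurrence can be checked by unrolling it: a call at the top level L = log((log n)/4) triggers 2^(L - level) calls at each lower level, and each of those costs cn log log n/2^(L - level), so every level contributes cn log log n and the total is L · cn log log n. The sketch below evaluates the recurrence exactly in units of c · n; the function name sort_cost and the parameter per_word_cost (standing for log log n) are hypothetical.

```python
from fractions import Fraction


def sort_cost(top_level, per_word_cost):
    """Evaluate T(level) = 2 T(level - 1) + per_word_cost / 2^(top_level - level), T(0) = 0.

    The result is T(top_level) in units of c * n, computed with exact fractions.
    """
    t = Fraction(0)
    for level in range(1, top_level + 1):
        t = 2 * t + Fraction(per_word_cost) / 2 ** (top_level - level)
    return t
```

With per_word_cost set to 1 the same function evaluates a recurrence without the log log n factor, whose solution is then simply top_level.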

The sorting process is not stable. Since we are sorting arbitrarily large
integers we can append the address bits to each input integer to stabilize the
sorting. Although this requires that each word contains log m + log n bits, when
m ≥ n the number of bits for each word can be kept at log m by using the idea of
radix sorting.

The space used for each next level of recursion in Sort is half the size of the
space used at the previous level. After the recursion returns the space can be
reclaimed. Thus the space used is linear, i.e. O(n).

Theorem 1: n integers can be sorted in linear space in time O(n (log log n)^3).

□ 



4 An Algorithm with Time Complexity O(n (log log n)^2)

We first note the following lemma.

Lemma 5: If the word length used in the algorithm is log m log log n, then n
integers in {0, 1, ..., m - 1} can be sorted into sets of size no larger than
sqrt(n) in linear space in time O(n log log n).

Proof: In this case the median finding takes linear time and we can use Lemma
4 to sort packed small integers. Also it takes O(log log n) time to extract out
the a_i^High's and a_i^Low's for log log n words (including computing the mask
and packing) because we can pack log log n words further into one word.
Therefore formula (1) becomes:

T(level) = 2T(level - 1) + cn/2^(log((log n)/4) - level);    (2)

T(0) = 0.

Therefore T(log((log n)/4)) = O(n log log n). That is, the time complexity
for dividing the input set into sets of size no larger than sqrt(n) is
O(n log log n). □

We apply the following technique to improve the time complexity of our
algorithm further.

We divide the log m bits of an integer into log n log log n blocks with each
block containing (log m)/(log n log log n) bits. Note that each block has at
least log log n bits because we can assume that log m ≥ log n (log log n)^2, for
otherwise we can use radix sort to sort the integers. We execute the following
algorithm:

Algorithm SpeedSort

while there is a set S which has size > sqrt(n) do
begin
    1. for each integer a_i ∈ S extract out a'_i which contains log n continuous
       blocks of a_i with the most significant block being the current block, put
       all a'_i's in S';
    2. Call IterateSort on set S';
end

Now since during the sorting process each word stores only (log n)/4 blocks,
only log m/log log n bits are used. By Lemma 5 one iteration of the
while loop in SpeedSort takes O(log log n) time for each integer. We account
the time for each integer in the sorting process by two variables D and E.
If an integer a has gone through t iterations of the while loop of SpeedSort
then (t - 1) log n blocks of a have been eliminated; we add O((t - 1) log log n)
to variable E, indicating that that much time has been expended to eliminate
(t - 1) log n blocks. We also add O(log log n) time to variable D, indicating
that that much time has been expended to divide the set in SpeedSort. Because we
can eliminate at most log log n log n blocks, the value of E is bounded above
by O((log log n)^2) throughout the integer sorting process, including the call to
SpeedSort to divide integers into sets of size ≤ sqrt(n) and the recursive calls
to SpeedSort to finish dividing the resulting sets into singleton sets. The value
of variable D is also bounded above by O((log log n)^2) because there are
log log n levels of recursion (one level divides a set into sets of size at most
the square root of its size) to divide integers into singleton sets. Therefore we
have

Theorem 2: n integers can be sorted in linear space in O(n (log log n)^2) time.

□ 



5 Nonconservative Sorting and Parallel Sorting 

When the word length is k log(m + n) for sorting integers in {0, 1, ..., m - 1}
we modify algorithm Sort in Section 3. Here whenever we sort t bits in the
integer we can move tk bits in Step 3 of Sort. Thus in Step 2 of Sort we can
divide a_i into k equal segments. Subsequently we can invoke recursion k times.
Each time we sort on a segment. Immediately upon the finish of each recursion we
move a_i to its sorted set. We can move the whole a_i instead of a segment of a_i
because we have the advantage of nonconservatism. Therefore algorithm Sort
can be done in O(n (log log n)^2/log k) time if each integer has only
O((log m)/k) bits. Here we assume that transposition is done in O(n log log n)
time for n words. If we apply the technique in Section 4 then in each pass we are
sorting only O((log m)/(k log log n)) bits for each integer, and therefore we can
assume that the transposition can be done in O(n) time for n words. Therefore the
time complexity for algorithm Sort becomes O(n log log n/log k). Since there
are O(log log n) calls to Sort made in the whole sorting process, the time
complexity of our nonconservative sorting algorithm to sort n integers is
O(n (log log n)^2/log k).

Theorem 3: n integers in {0, 1, ..., m - 1} can be sorted in
O(n (log log n)^2/log k) time and linear space with word length k log(m + n),
where 1 ≤ k ≤ log n.

Concerning parallel integer sorting we note that on the EREW PRAM we
can have the following lemma to replace Lemma 1.

Lemma 6: An integer a among the n integers packed into n/k words can be
computed on the EREW PRAM in O(log n) time and O(n/k) operations using
O(n/k) space such that a is ranked at least n/4 and at most 3n/4 among the n
integers.

The proof of Lemma 6 can be obtained by applying Cole's parallel selection
algorithm […]. Note that we can do without packing and therefore the factor
log k does not show up in the time complexity.

Currently Lemma 2 cannot be parallelized satisfactorily. On the EREW
PRAM the currently best result […] sorts in O(log n) time and O(n sqrt(log n))
operations. To replace Lemma 2 for parallel sorting we resort to nonconservatism.

Lemma 7: If k integers, with k a power of 2, using a total of (log n)/2 bits are
packed into one word, then the nk integers in n words can be sorted in O(log n)
time and O(n) operations on the EREW PRAM using O(n) space, provided that the
word length is Ω((log n)^2).

The sorting of words in Lemma 7 is done with the nonconservative sorting
algorithm in […]. The transposition can also be done in O(n) operations because
of nonconservatism.

For Lemma 3 we have to assume that log m = Ω((log n)^2). Then we can sort
the n integers in n/k words by their labels in O(log n) time and
O((n log log n)/k) operations on the EREW PRAM using O(n/k) space. Note here that
the labels are themselves sorted by the nonconservative sorting algorithm in
Lemma 7. Note also that the transposition here incurs a factor of log log n in
the operation complexity.

Lemma 4 and Section 4 show how to remove the factor log log n from the
time complexity incurred in transposition with nonconservatism. This applies
to parallel sorting as well, to remove the factor of log log n from the operation
complexity.

Because algorithm Sort uses the algorithms in Lemmas 1 to 3 O(log n) times
and because we can now replace Lemmas 1 to 3 with the corresponding lemmas for
parallel computation, algorithm Sort is in effect converted into a parallel EREW
PRAM algorithm with time complexity O((log n)^2) and operation complexity
O(n (log log n)^2). The technique in Section 4 applies to parallel sorting.
Therefore we have

Theorem 4: n integers in {0, 1, ..., m - 1} can be sorted in O((log n)^2) time
and O(n (log log n)^2) operations provided that log m = Ω((log n)^2).

Note that although algorithm Sort takes O((log n)^2) time, the whole sorting
algorithm takes O((log n)^2) time as well because subsequent calls to Sort take
geometrically decreasing time.

6 Improving the Complexity 

The results given in the previous sections can be improved further by using our
approach and technique alone. In particular, O(n (log log n)^2/log log log n)
time can be achieved for sequential linear space conservative sorting.
O(n (log log n)^2/log log log n) operations and O((log n)^2) time can be achieved
for parallel linear space sorting on the EREW PRAM. And
O(n (log log n)^2/(log k + log log log n)) time can be achieved for
nonconservative linear space sorting with word length k log(m + n), where
k ≤ log n. The techniques for these improvements are too involved and would
require too much space for presentation. Therefore the details of these
algorithms are omitted here and will be given in the full version of the paper.
We instead show how to combine our algorithm with that of Thorup […] to obtain
an algorithm with time complexity O(n (log log n)^1.5).

In […] Thorup builds an exponential search tree and associates a buffer B(v)
with each node v of the tree. He defines that a buffer B(v) is over-full if
|B(v)| > d(v), where d(v) is the number of children of v. Our modification of
Thorup's approach is that we define B(v) to be over-full if |B(v)| > (d(v))^2.
Other aspects of Thorup's algorithm are not modified. Since a buffer is flushed
(see Thorup's definition […]) only when it is over-full, using our modification
we can show that the time for a flush can be reduced to
|B(v)| sqrt(log log n). This will give the O(n (log log n)^1.5) time for sorting
by Thorup's analysis […].

The flush can be done in theory by sorting the elements in B(v) together with
the set D(v) of keys at v's children. In our algorithm this theoretical sorting
is done as follows. First, for each integer in B(v), execute … log n steps of the
binary search on the dictionary built in Section 3 of […]. After that we have
converted the original theoretical sorting problem into the problem of sorting
|B(v)| integers (coming from B(v) and denoted by B'(v)) of log m … bits
with d(v) integers (coming from D(v) and denoted by D'(v)) of (log m)/2^… bits.
Note that here a word has log m bits. Also note that |B(v)| > (d(v))^2 and
what we need is to partition B'(v) by the d(v) integers in D'(v), and therefore
sorting all integers in B'(v) ∪ D'(v) is not necessary. By using the
nonconservative version of the algorithm Sort we can then partition the integers
in B'(v) into sets such that the cardinality of a set is either < sqrt(d(v)) or
all integers in the set are equal. The partition maintains that for any two sets
all integers in one set are larger than all integers in the other set. Because we
used the nonconservative version of Sort the time complexity is
O(|B(v)| sqrt(log log n)). Then each integer in D(v) can find out which set it
falls in. Since each set has cardinality no larger than sqrt(d(v)), the integers
in D(v) can then further partition the sets they fall in, thereby partitioning
B(v) in an additional O(|B(v)|) time. Overall the flush thus takes
O(|B(v)| sqrt(log log n)) time. By the analysis in […] the time complexity
for sorting in linear space is thus O(n (log log n)^1.5).

Theorem 5: n integers in {0, 1, ..., m - 1} can be sorted in
O(n (log log n)^1.5) time and linear space.

7 Conclusions 

The complexity of our algorithm could be improved further if we could sort better
in Lemma 2, i.e. either sort stably with the same complexity or sort more
bits in a word instead of (log n)/2 bits. It is not clear whether
O(n (log log n)^1.5) time is the lower bound for sorting integers in linear
space. Note that our bound is very close to the current bound for sorting
integers in nonlinear space (which is O(n log log n)) […]. Also note that Thorup
[…] showed that O(n log log n) time and linear space can be achieved with
randomization. It would be interesting to see whether the current O(n log log n)
time complexity for nonlinear space and randomization can be achieved in linear
space deterministically.

Abstract. Dutton (1993) presents a further HEAPSORT variant called
WEAK-HEAPSORT, which also contains a new data structure for priority
queues. The sorting algorithm and the underlying data structure
are analyzed showing that WEAK-HEAPSORT is the best HEAPSORT
variant and that it has a lot of nice properties.

It is shown that the worst case number of comparisons is
n⌈log n⌉ - 2^⌈log n⌉ + n - ⌈log n⌉ ≤ n log n + 0.1n and weak heaps can be
generated with n - 1 comparisons. A double-ended priority queue based on
weak-heaps can be generated in n + ⌈n/2⌉ - 2 comparisons.
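
The worst case bound of the abstract, n⌈log n⌉ - 2^⌈log n⌉ + n - ⌈log n⌉ ≤ n log n + 0.1n, can be checked numerically. The sketch below only evaluates the closed expression with logarithms to base 2; it does not run WEAK-HEAPSORT, and the function name is hypothetical.

```python
import math


def weak_heapsort_worst_case(n):
    """Evaluate n*ceil(log n) - 2^ceil(log n) + n - ceil(log n) for n >= 1 (log base 2)."""
    ceil_log = (n - 1).bit_length()  # equals ceil(log2 n) for n >= 1
    return n * ceil_log - 2 ** ceil_log + n - ceil_log
```

Writing θ = ⌈log n⌉ - log n, the expression exceeds n log n by n(1 + θ - 2^θ) - ⌈log n⌉, and 1 + θ - 2^θ stays below 0.087 on [0, 1), which is why the constant 0.1 suffices.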

Moreover, examples for the worst and the best case of WEAK-HEAPSORT
are presented, the number of Weak-Heaps on {1, ..., n} is determined,
and experiments on the average case are reported.



1 Introduction 

General sequential sorting algorithms require at least ⌈log(n!)⌉ = n log n -
n log e + O(log n) ≈ n log n - 1.4427n key comparisons in the worst case and
⌈log(n!)⌉ - … < ⌈log(n!)⌉ - 1 comparisons in the average case. We assume
that the time for all other operations should be small compared to the time
of a key comparison. Therefore, in order to compare different sorting algorithms
the following six criteria are desirable:

1. The sorting algorithm should be general, i.e., objects of any totally ordered 
set should be sorted. 

2. The implementation of the sorting algorithm should be easy. 

3. The sorting algorithm should allow internal sorting, i.e., besides the space
consumption for the input array only limited extra space is available.

4. For a small constant c the average case on key comparisons should be less
than n log n + cn.

5. For a small constant c' the worst case on key comparisons should be less
than n log n + c'n.

6. The number of all other operations such as exchanges, assignments and other 
comparisons should exceed the number of key comparisons by at most a 
constant factor. 
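
To put criteria 4 and 5 in perspective, the information-theoretic bound ⌈log(n!)⌉ quoted at the start of this section can be compared with its estimate n log n - n log e. The sketch below computes both with logarithms to base 2; the function names are hypothetical.

```python
import math


def comparison_lower_bound(n):
    """Return ceil(log2(n!)), the worst case lower bound on key comparisons, for n >= 1."""
    return (math.factorial(n) - 1).bit_length()


def stirling_estimate(n):
    """Return n log2 n - n log2 e, the leading terms of log2(n!)."""
    return n * math.log2(n) - n * math.log2(math.e)
```

The difference between the two grows only logarithmically in n, so a sorting algorithm with n log n + cn comparisons is within a linear term of the optimum.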



H. Reichel and S. Tison (Eds.): STACS 2000, LNCS 1770, pp. 254-…, 2000.
(c) Springer-Verlag Berlin Heidelberg 2000

BUCKETSORT and its RADIXSORT variants (Nilsson (1996)) are not general
as required in Property 1.

Given n = 2^k, traditional MERGESORT performs at most n log n - n + 1
key comparisons, but requires O(n) extra space for the objects, which violates
Property 3. Current work of J. Katajainen, T. A. Pasanen, and J. Teuhola (1996),
J. Katajainen and T. A. Pasanen (1999), and K. Reinhardt (1992) shows that
MERGESORT can be designed to be in-place and to achieve promising results,
e.g. n log n - 1.3n + O(log n) comparisons in the worst case. However, for
practical purposes these algorithms tend to be too complicated and too slow.

INSERTIONSORT (Steinhaus (1958)) invokes fewer than log(n!) + n - 1 key
comparisons, but even in the average case the number of exchanges is in O(n^2),
violating Property 6.

SHELLSORT (Shell (1959)) requires as an additional input a decreasing
integer sequence ending in 1. According to these distances of array indices
traditional INSERTIONSORT is invoked. The proper choice of the distances is
important for the running time of SHELLSORT. In the following we summarize
the worst case number of operations (comparisons and exchanges). For n =
2^k Shell's original sequence (2^(k-1), ..., 2, 1) leads to a quadratic running
time. A suggested improvement of Hibbard (1963) achieves O(n^(3/2)) with the
analysis given by Papernov and Stasevich (1965). Pratt (1979) provided a sequence
of length Θ(log^2 n) that led to Θ(n log^2 n) operations. Sedgewick (1986)
improves the O(n^(3/2)) bound for sequences of maximal length O(log n) to
O(n^(4/3)) and in a joint work with Incerpi (1985) he further improves this to
O(n^(1 + ε/sqrt(log n))) for a given ε > 0. Based on incompressibility results in
Kolmogorov complexity, a very recent result of Jiang, Li and Vitanyi (1999)
states that the average number of operations in (so-called p-pass) SHELLSORT for
any incremental sequence is in Ω(p n^(1 + 1/p)). Therefore, SHELLSORT violates
Properties 4 and 5.

QUICKSORT (Hoare (1962)) consumes O(n^2) comparisons in the worst case.
For the average case number of comparisons V(n) we get the following recurrence
equation: V(n) = n - 1 + (1/n) Σ_{k=1}^{n} (V(k - 1) + V(n - k)). This sum can be
simplified to V(n) = 2(n + 1)H_n - 4n, with H_n = Σ_{i=1}^{n} 1/i, which leads to
the approximation V(n) ≈ 1.386 n log n - 2.846n + O(log n).
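
The closed form for the average case of QUICKSORT can be checked against the recurrence with exact arithmetic. The sketch below assumes the standard boundary value V(0) = 0 and the usual reading of the recurrence, V(n) = n - 1 + (1/n) Σ (V(k - 1) + V(n - k)); the function names are hypothetical.

```python
from fractions import Fraction


def quicksort_average(n_max):
    """Return [V(0), ..., V(n_max)] from V(n) = n - 1 + (1/n) * sum_k (V(k-1) + V(n-k))."""
    v = [Fraction(0)] * (n_max + 1)
    for n in range(1, n_max + 1):
        total = sum((v[k - 1] + v[n - k] for k in range(1, n + 1)), Fraction(0))
        v[n] = n - 1 + total / n
    return v


def closed_form(n):
    """Return 2(n + 1) H_n - 4n with H_n the n-th harmonic number."""
    h = sum((Fraction(1, i) for i in range(1, n + 1)), Fraction(0))
    return 2 * (n + 1) * h - 4 * n
```

Since H_n ≈ ln n + 0.5772, the leading coefficient with logarithms to base 2 is 2 ln 2 ≈ 1.386, matching the approximation in the text.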

Hoare also proposed CLEVER-QUICKSORT, the median-of-three variant of
QUICKSORT. In the worst case we still have O(n^2) key comparisons but the
average case number can be significantly reduced. A case study reveals that
the median of three objects can be found in 8/3 comparisons on the average.
Therefore, we have n - 3 + 8/3 = n - 1/3 comparisons in the divide step,
leading to the following recurrence for the average case: V(n) = n - 1/3 +
C(n, 3)^(-1) Σ_{k=1}^{n} (k - 1)(n - k)(V(k - 1) + V(n - k)), where C(n, 3) is
the binomial coefficient. This sum simplifies to V(n) = (12/7)(n + 1)H_{n-1} -
(477/147)n + 223/147 + 252/(147n) ≈ 1.188 n log n - 2.255n + O(log n) (Sedgewick
(1977)). No variant of QUICKSORT is known with n log n + o(n log n) comparisons
on average (cf. van Emden (1970)). Hence, it violates Properties 4 and 5.
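
The figure of 8/3 comparisons for the median of three objects can be reproduced by enumerating all six orderings. The decision procedure below is one natural choice and is an assumption, since the text does not spell out the case study; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def median_of_three(a, b, c):
    """Return (median, comparisons used) for three distinct keys."""
    comparisons = 1
    if a > b:
        a, b = b, a  # after this, a < b
    comparisons += 1
    if b < c:
        return b, comparisons  # a < b < c: two comparisons suffice
    comparisons += 1
    return (c if a < c else a), comparisons  # c < b: the median is max(a, c)
```

Two of the six orderings need two comparisons and four need three, giving (2 · 2 + 4 · 3)/6 = 8/3 on average.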

The worst case number of key comparisons in HEAPSORT, independently
invented by Floyd (1964) and Williams (1964), is bounded by 2n log n +
O(n). For the generating phase less than 2n - 2 comparisons are required.

BOTTOM-UP-HEAPSORT (Wegener (1993)) is a variant of HEAPSORT
with 1.5n log n + O(n) key comparisons in the worst case. The idea is to search
the path to the leaf independently of the place for the root element to sink.
Since the expected depth is high this path is traversed bottom-up. Fleischer
(1991) as well as Schaffer and Sedgewick (1993) give worst case examples for
which BOTTOM-UP-HEAPSORT requires at least 1.5n log n - o(n log n) comparisons.
Based on the idea of Ian Munro (cf. Li and Vitanyi (1992)) one can infer that the
average number of comparisons in this variant is bounded by n log n + O(n).

MDR-HEAPSORT proposed by McDiarmid and Reed (1989) performs less
than n log n + cn comparisons in the worst case and extends
BOTTOM-UP-HEAPSORT by using one bit to encode on which branch the smaller
element can be found and another one to mark if this information is unknown. The
analysis that bounds c in MDR-HEAPSORT to 1.1 is given by Wegener (1993).
WEAK-HEAPSORT is more elegant and faster. Instead of two bits per element
WEAK-HEAPSORT uses only one and the constant c is less than 0.1.

In order to ensure the upper bound n log n + O(n) on the number of
comparisons, ULTIMATE-HEAPSORT proposed by Katajainen (1998) avoids the worst
case examples for BOTTOM-UP-HEAPSORT by restricting the set of heaps to
two layer heaps. It is more difficult to guarantee this restricted form.
Katajainen obtains an improved bound for the worst case number of comparisons but
the average case number of comparisons is larger than for BOTTOM-UP-HEAPSORT.
Here we allow a larger class of heaps that are easier to handle and lead to an
improvement for the worst case and the average case.

There is one remaining question: How expensive is the extra space
consumption of one bit per element? Since we assume objects with time-costly
key comparisons we can conclude that their structure is more complex than
an integer, which on current machines consumes 64 bits to encode the interval
$[-2^{63}, 2^{63} - 1]$. Investing one bit per element only halves this interval.

The paper is structured as follows. Firstly, we concisely present the design, 
implementation and correctness of the WEAK-HEAPSORT algorithm with side 
remarks extending the work of Dutton (1993). Secondly, we determine the num- 
ber of Weak-Heaps according to different representation schemas. Afterwards we 
prove that there are both worst and best case examples that exactly meet the 
given bounds. Finally, we turn to the use of Weak-Heaps as a priority queue. 



2 The WEAK-HEAPSORT Algorithm 

2.1 Definition Weak-Heap and Array-Representation 

A (Max-) Weak-Heap is established by relaxing the heap condition as follows: 

1. Every key in the right subtree of each node is smaller than or equal to the 
key at the node itself. 

2. The root has no left child. 

3. Leaves are found on the last two levels of the tree only. 
Fig. 1. Example of a Weak-Heap.



An example of a Weak-Heap is given in Figure 1. The underlying structure
to describe the tree structure is a combination of two arrays. First, we have the
array a in which the objects are found and second, the array r of so-called reverse
bits represents whether the tree associated with a node is rotated or not.

For the array representation we define the left child of index i as $2i + r_i$
and the right child as $2i + 1 - r_i$, $0 \le i \le n - 1$. Thus by flipping $r_i$ we
exchange the indices of the right and left children. The subtree of i is
therefore rotated. For example, one array representation according to Fig. 1 is a =
[14, 11, 9, 13, 8, 10, 12, 7, 0, 4, 5, 2, 1, 6, 3] and r = [0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1].

The successors of a node on the left branch of the right subtree are called
grandchildren. For the example of Figure 1 the grandchildren of the root are
labeled by 11, 13, 7 and 3. The inverse function Gparent(x) (for grandparent)
is defined as Gparent(Parent(x)) in case x is a left child and Parent(x) if
x is a right one. Gparent(x) can be calculated by the following pseudo-code:
while (odd(x) = $r_{\lfloor x/2 \rfloor}$) $x \leftarrow \lfloor x/2 \rfloor$ followed by
return $\lfloor x/2 \rfloor$ with an obvious interpretation of odd.
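
The Gparent pseudo-code can be tried out directly on the arrays quoted for Fig. 1. The following is a minimal Python sketch, not the authors' implementation; the function name `gparent` and the helper variables are illustrative choices, and the sketch assumes that `r[0]` is 0 and that the function is only called for x >= 1.

```python
def gparent(x, r):
    """Grandparent of node x (x >= 1) for reverse bits r, assuming r[0] == 0."""
    # x is a left child of p = x // 2 exactly when x == 2p + r[p],
    # i.e. when the parity of x equals r[p]; climb while that holds.
    while (x & 1) == r[x // 2]:
        x //= 2
    # x is now a right child; its parent is the grandparent we look for.
    return x // 2


# Array representation quoted in the text for Fig. 1.
a = [14, 11, 9, 13, 8, 10, 12, 7, 0, 4, 5, 2, 1, 6, 3]
r = [0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]

# Indices whose grandparent is the root, and the keys stored there.
root_grandchildren = [i for i in range(1, len(a)) if gparent(i, r) == 0]
root_grandchild_keys = [a[i] for i in root_grandchildren]

# Weak-Heap condition: no key exceeds the key at its grandparent.
is_weak_heap = all(a[gparent(i, r)] >= a[i] for i in range(1, len(a)))
```

The keys collected in `root_grandchild_keys` are the ones the text lists as grandchildren of the root.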



2.2 Generating Phase 

Let y be the index of the root of a tree T and x be the index of a node with
$a_x \ge a_z$ for all z in the left subtree of T and let the right subtree of T and y
itself be a Weak-Heap. Merging x and y gives a new Weak-Heap according to the
following case study. If $a_x \ge a_y$ then the tree with root x and right child T is
a Weak-Heap. If, however, $a_y > a_x$ we swap $a_y$ with $a_x$ and rotate the subtrees
in T. By the definition of a Weak-Heap it is easy to see that if the leaves in T
are located only on the last two levels merging x with y results in a Weak-Heap.
The pseudo-code according to merge is given by if ($a_x < a_y$) swap($a_x$, $a_y$);
$r_y \leftarrow 1 - r_y$.

In the generating phase all nodes at index i for decreasing i = n - 1, ..., 1 are
merged to their grandparents. The pseudo-code for the so-called WeakHeapify
procedure can be specified as: for $i \in \{n - 1, \ldots, 1\}$ Merge(Gparent(i), i).
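
As an illustration of Merge and WeakHeapify, here is a small Python sketch under the array conventions of Sect. 2.1. The names `merge`, `weak_heapify`, `is_weak_heap` and the one-element list `counter` used for counting comparisons are assumptions made for this example and do not come from the paper.

```python
def gparent(x, r):
    # Climb while x is a left child (parity of x equals the parent's reverse bit).
    while (x & 1) == r[x // 2]:
        x //= 2
    return x // 2


def merge(a, r, x, y, counter):
    """Merge node x with the tree rooted at y, using one key comparison."""
    counter[0] += 1
    if a[x] < a[y]:
        a[x], a[y] = a[y], a[x]
        r[y] = 1 - r[y]          # rotate the subtrees of y


def weak_heapify(a):
    """Turn list a into a Weak-Heap in place; return (reverse bits, comparisons)."""
    n = len(a)
    r = [0] * n
    counter = [0]
    for i in range(n - 1, 0, -1):
        merge(a, r, gparent(i, r), i, counter)
    return r, counter[0]


def is_weak_heap(a, r):
    # Every key is bounded by the key at its grandparent.
    return all(a[gparent(i, r)] >= a[i] for i in range(1, len(a)))


keys = [3, 9, 1, 14, 7, 0, 12, 5, 11, 2]
bits, comparisons = weak_heapify(keys)
```

Each index from n - 1 down to 1 triggers exactly one call of `merge`, so the sketch spends n - 1 key comparisons on any input.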

Theorem 1. WeakHeapify generates a Weak-Heap according to its definition. 
Proof. Assume that there is an index y, such that Merge(Gparent(y), y) does
not return a Weak-Heap at x = Gparent(y). Then choose y maximal in this
sense. Since all nodes w > y with Gparent(w) = x have led to a correct
Weak-Heap, we have $a_x \ge a_z$ for all z in the left subtree of root y. On the other hand
y and its right subtree already form a Weak-Heap. Therefore, all preconditions
of merging x with y are fulfilled, yielding a Weak-Heap at root x, a contradiction.

One reason why the WEAK-HEAPSORT algorithm is fast is that the gener- 
ating phase requires the minimal number of n — 1 comparisons. 

Note that the Gparent calculations in WeakHeapify lead to several shift
operations. This number is linear with respect to the accumulated path length
L(n), which can recursively be fixed as $L(2) = 1$ and $L(2^k) = 2 \cdot L(2^{k-1}) + k$.
For $n = 2^k$ this simplifies to $2n - \log n - 2$. Therefore, the additional computations
in the generating phase are in O(n).
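
The closed form for the accumulated path length can be checked mechanically against the recurrence. The short Python sketch below does this for the first few powers of two; the function name `accumulated_path_length` is an illustrative assumption.

```python
def accumulated_path_length(k):
    """L(2^k) from the recurrence L(2) = 1, L(2^k) = 2 * L(2^(k-1)) + k."""
    if k == 1:
        return 1
    return 2 * accumulated_path_length(k - 1) + k


# Triples (n, value from the recurrence, value of the closed form 2n - log n - 2).
table = [(2 ** k, accumulated_path_length(k), 2 * 2 ** k - k - 2)
         for k in range(1, 11)]
```

For n = 2, 4, 8 both columns give 1, 4 and 11, respectively.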

2.3 Sorting Phase 

Similar to HEAPSORT we successively swap the top element $a_0$ with the last
element $a_m$ in the array, $n - 1 \ge m \ge 2$, and restore the defining Weak-Heap
conditions in the interval [0...m - 1] by calling an operation MergeForest(m):

First of all we traverse the grandchildren of the root. More precisely, we
set an index variable x to the value 1 and execute the following loop: while
($2x + r_x < m$) $x \leftarrow 2x + r_x$. Then, in a bottom-up traversal, the Weak-Heap
conditions are regained by a series of merge operations. This results in a second
loop: while (x > 0) Merge(0, x); $x \leftarrow \lfloor x/2 \rfloor$ with at most
$\lceil \log(m + 1) \rceil$ key comparisons.

Theorem 2. MergeForest generates a Weak-Heap according to its definition. 

Proof. After traversing the grandchildren set of the root, x is the leftmost leaf 
in the Weak-Heap. Therefore, the preconditions to the first Merge operation are 
trivially fulfilled. Hence the root and the subtree at x form a Weak-Heap. Since 
the Weak-Heap definition is reflected in all substructures for all grandchildren y 
of the root we have that y and its right subtree form a Weak-Heap. Therefore, 
we correctly combine the Weak-Heaps at position 0 and y and continue in a 
bottom-up fashion. 



2.4 The WEAK-HEAPSORT Algorithm

WEAK-HEAPSORT combines the generating and the sorting phase. It invokes
WeakHeapify and loops on the two operations swap(0, m) and MergeForest(m).
Since the correctness has already been shown above, we now turn to the time
complexity of the algorithm measured in the number of key comparisons.
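
Putting the two phases together, the whole algorithm fits in a few lines. The following Python sketch follows the pseudo-code of Sects. 2.2 and 2.3 and returns the number of key comparisons so that it can be compared with the bounds discussed next. The function name, the nested helpers and the final exchange of positions 0 and 1 (added here to leave the array in ascending order) are assumptions of this sketch rather than details given in the text.

```python
def weak_heapsort(a):
    """Sort list a in place (ascending) and return the number of key comparisons."""
    n = len(a)
    r = [0] * n                      # reverse bits, one per element
    comparisons = 0

    def gparent(x):
        while (x & 1) == r[x // 2]:
            x //= 2
        return x // 2

    def merge(x, y):
        nonlocal comparisons
        comparisons += 1
        if a[x] < a[y]:
            a[x], a[y] = a[y], a[x]
            r[y] = 1 - r[y]

    # Generating phase (WeakHeapify): n - 1 comparisons.
    for i in range(n - 1, 0, -1):
        merge(gparent(i), i)

    # Sorting phase: move the maximum to position m, then MergeForest(m).
    for m in range(n - 1, 1, -1):
        a[0], a[m] = a[m], a[0]
        x = 1
        while 2 * x + r[x] < m:      # walk down the grandchildren of the root
            x = 2 * x + r[x]
        while x > 0:                 # bottom-up series of merges with the root
            merge(0, x)
            x //= 2

    # Assumption of this sketch: positions 0 and 1 hold the two smallest keys
    # in descending order at this point, so one exchange finishes the sort.
    if n >= 2:
        a[0], a[1] = a[1], a[0]
    return comparisons


data = [5, 1, 4, 1, 5, 9, 2, 6, 5, 3]
used = weak_heapsort(data)
```

Counting comparisons inside `merge` keeps the bookkeeping in one place, since Merge is the only operation that compares keys.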

Theorem 3. Let $k = \lceil \log n \rceil$. The worst case number of key comparisons of
WEAK-HEAPSORT is bounded by $nk - 2^k + n - 1 < n \log n + 0.086013n$.

Proof. The calls MergeForest(i) perform at most
$\sum_{i=2}^{n-1} \lceil \log(i + 1) \rceil = nk - 2^k$
comparisons (and at least
$\sum_{i=2}^{n-1} (\lceil \log(i + 1) \rceil - 1) = nk - 2^k - n + 2$ comparisons).
Together with the n - 1 comparisons to build the Weak-Heap we have
$nk - 2^k + n - 1$ comparisons altogether. Utilizing basic calculus we deduce that for all n
there is an x in [0, 1] with $nk - 2^k + n - 1 = n \log n + nx - n 2^x + n - 1 =
n \log n + n(x - 2^x + 1) - 1$ and that the function $f(x) = x - 2^x + 1$ takes its
maximum at $x_0 = -\ln \ln 2 / \ln 2$ and $f(x_0) = 0.086013$. Therefore, the number
of key comparisons in WEAK-HEAPSORT is less than $n \log n + 0.086013n$.
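
The calculus step can be retraced numerically. The Python sketch below evaluates $f(x) = x - 2^x + 1$ at the stationary point $x_0 = -\ln \ln 2 / \ln 2$ and compares it with a grid search over [0, 1]; the variable names are illustrative. In floating point this evaluation gives roughly 0.08607 for $f(x_0)$, which differs in the last digits from the printed 0.086013, so the printed constant may deserve a second look; the rounded bound $n \log n + 0.1n$ is not affected either way.

```python
import math


def f(x):
    """f(x) = x - 2^x + 1, the per-element excess over n log n."""
    return x - 2.0 ** x + 1.0


# Stationary point: f'(x) = 1 - 2^x ln 2 = 0, hence 2^x = 1 / ln 2.
x0 = -math.log(math.log(2.0)) / math.log(2.0)
f_max = f(x0)

# Cross-check by brute force on a fine grid over [0, 1].
grid_max = max(f(i / 10000.0) for i in range(10001))
```

Since $f''(x) = -(\ln 2)^2 2^x < 0$, the function is concave and the stationary point is its unique maximum on the interval.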

3 The Number of Weak-Heaps 

Let W{n) be the set of roots of complete subtrees in the Weak-Heap of size n. 

Theorem 4. If the input of WEAK-HEAPSORT is a random permutation of
the elements {1, ..., n}, then every possible and feasible Weak-Heap occurs with
the same probability. Moreover, there are $n!/2^{|W(n)|}$ different Weak-Heaps
represented as a binary tree.

Instead of a formal proof (cf. [3]), in Fig. 2 we give a simple example illustrating
the idea of the backward analysis: For n = 5 we find two roots of complete
subtrees (these nodes are double-encircled). In the first step swapping the top 
two elements leads to a dead end, since the generated Weak-Heap becomes in- 
feasible. No further operation can move the (deepest) leaf to the leftmost branch 
as required for a correct input. Swapping 2 and 5 in the second step analogously 
leads to an infeasible Weak-Heap. Only in the following steps (according to roots 
of complete subtrees) both successor Weak-Heaps are feasible. 

On the other hand, the assignment of the reverse bits uniquely determines which
cases in Merge have been chosen in the generating phase.

Theorem 5. There are n! different array-embedded Weak-Heaps.



4 The Best Case of WEAK-HEAPSORT 

This section proves Dutton’s conjecture (1992) that an increasing sequence of 
input elements leads to the minimal number of key comparisons. 

For the best case it will be sufficient that in every invocation of MergeForest
the traversed path P to the leaf node, special path for short, terminates at the 
last position of the array. Subsequently, by exchanging the current root with the 
element at this position the path is pruned by one element. Fig. 0 depicts an 
example for this situation. 

Therefore, by successively traversing the $2i + r_i$ successors from index 1
onwards we end up at index n - 1. Hence, the $r_i$ have to coincide with the binary
encoding $(b_k \ldots b_0)_2$ of n - 1. More precisely, if
$r_{\lfloor (n-1)/2^i \rfloor} = b_{i-1}$ for $i \in \{1, \ldots, k\}$,
then n - 1 is the last element of the special path. In case of the input $a_i = i$ for
$i \in \{0, \ldots, n - 1\}$, WeakHeapify leads to $r_0 = 0$ and $r_j = 1$ for $j \notin P$. Moreover,
Fig. 2. Backward analysis of WEAK-HEAPSORT. (Panel labels: Merge(0,1), Merge(0,2), Merge(1,3), Merge(0,4).)



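for $r_j$ with $j \in P$ we get the binary representation of n - 1 as required. In other
words, Heap(n) defined as

$$
\begin{aligned}
Heap(n) ={}& \tau^{1}_{Gparent(n-1),\,n-1} \circ \tau^{1}_{Gparent(n-2),\,n-2} \circ \cdots \circ
 \tau^{1}_{Gparent(\lfloor \frac{n-1}{2} \rfloor + 1),\,\lfloor \frac{n-1}{2} \rfloor + 1} \circ {} \\
& \tau^{b_0}_{Gparent(\lfloor \frac{n-1}{2} \rfloor),\,\lfloor \frac{n-1}{2} \rfloor} \circ
 \tau^{1}_{Gparent(\lfloor \frac{n-1}{2} \rfloor - 1),\,\lfloor \frac{n-1}{2} \rfloor - 1} \circ \cdots \circ {} \\
& \tau^{b_1}_{Gparent(\lfloor \frac{n-1}{4} \rfloor),\,\lfloor \frac{n-1}{4} \rfloor} \circ
 \tau^{1}_{Gparent(\lfloor \frac{n-1}{4} \rfloor - 1),\,\lfloor \frac{n-1}{4} \rfloor - 1} \circ \cdots \circ {} \\
& \tau^{b_{k-2}}_{Gparent(\lfloor \frac{n-1}{2^{k-1}} \rfloor),\,\lfloor \frac{n-1}{2^{k-1}} \rfloor} \circ \cdots \circ
 \tau^{b_{k-1}}_{0,1}
\end{aligned}
$$

correctly determines the transpositions according to the generating phase,
where $\tau^{\alpha}_{i,j}$ is the transposition of i and j if $\alpha$ is odd and the identity, otherwise.
As an example consider Heap(15) = (14 3)^1 (13 6)^1 (12 1)^1 (11 5)^1
(10 2)^1 (9 4)^1 (8 0)^1 (7 3)^0 (6 1)^1 (5 2)^1 (4 0)^1 (3 1)^1 (2 0)^1 (1 0)^1.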

We now consider the second largest element n - 2 and assume that n - 2 has
the binary encoding $(c_k \ldots c_0)_2$. Further, let $\oplus$ denote the exclusive or operation.

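Lemma 1. Let $a_i = i$ for $i \in \{0, \ldots, n - 1\}$ be the input for
WEAK-HEAPSORT. After the generating phase the element n - 2 will be placed at position
$\lfloor (n-1)/2^{i^*} \rfloor$ with $i^* = \max\{i \mid b_{i-1} \oplus c_{i-1} = 1\}$. Moreover,
for $j < i < i^*$ we have $a_{\lfloor (n-1)/2^{i} \rfloor} > a_{\lfloor (n-1)/2^{j} \rfloor}$.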
Proof. If $b_0 = 1$ then n - 1 is a right child and n - 2 is a left child. Hence,
Parent(n - 2) = Gparent(n - 1) and Gparent(n - 2) = Gparent(Gparent(n - 1)).
Therefore, n - 2 will be finally located at position Gparent(n - 1) = $\lfloor (n-1)/2^{i^*} \rfloor$.
Moreover, the key n — 2 at is larger than + 1 located at n — 1. 

Therefore, for j < i < i* we have as required. 

For the other case 6 q = 0 we first consider n — 1 ^ 2^. The leaf n — 1 is as 
long a left child as n — 2 is a right child. Therefore, n — 2 will be a left child at 
[ 2 ? 1 J with i* = max{i \ 6i_i 0 Ci_i = 1}. Since Gparent{Gparent{n — 1)) = 
Gparent{'i^^\) = Gparent{ [ J ) , n — 2 will finally be located at 
Now let n — 1 = 2^. In this case Gparent{n — I) is the root. Since for all i we 
have Parenf( ) = Gparent { ), the element n — 2 will eventually reach 
position 1 = ■ 

To complete the proof the monotonicity criterion remains to be shown. An 
element can escape from a Weak-Heap subtree structure only via the associated 
root. The position Gparent{n — 1) will be occupied by n — 1 and for all elements 
on P we know that the key at position \J^^\ is equal to maa;{{[^^J} U {k \ 
k G rr([^^J)}} = + 2®“^, with rT{x) denoting the right subtree of x. 

Therefore, for all j < i < i* we conclude that the key at is larger than 

the key at . Since a, n-i , = n — 2, this condition also holds at f = i*. 

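Lemma 2. After the initial swap of position 0 and n - 1 MergeForest invokes
the following set of transpositions $CaseB(n) := \tau^{b_0 \oplus c_0}_{0,\lfloor (n-1)/2 \rfloor} \circ \ldots \circ
\tau^{b_{k-1} \oplus c_{k-1}}_{0,\lfloor (n-1)/2^k \rfloor}$.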

Proof. Lemma 1 proves that all swaps of position $\lfloor (n-1)/2^j \rfloor$ with position 0 with
j < i* are executed, since at the root we always find an element smaller than the
currently considered one. We also showed that the maximum element n - 2
is located at $\lfloor (n-1)/2^{i^*} \rfloor$. Therefore, no further element can reach the root. This
corresponds to the observation that only for j > i* we have $b_{j-1} \oplus c_{j-1} = 0$.

For the example given above we have CaseB(15) = (0 7)^1 (0 3)^1 (0 1)^0.

The proof of the following two results 

Lemma 3. $Heap(n) \circ (0\ \ n - 1) = (Gparent(n - 1)\ \ n - 1) \circ Heap(n)$.

Lemma 4. $Heap(n - 1)^{-1} \circ (Gparent(n - 1)\ \ n - 1) \circ Heap(n) = CaseB(n)^{-1}$.

is technically involved and can be found in [5].

Lemma 5. $Heap(n) \circ Swap(0, n - 1) \circ CaseB(n) \circ Heap(n - 1)^{-1} = id_{\{0, \ldots, n-1\}}$.

Proof. By right and left multiplication with Heap(n - 1) and $Heap(n - 1)^{-1}$ the
equation $Heap(n) \circ Swap(0, n - 1) \circ CaseB(n) \circ Heap(n - 1)^{-1} = id_{\{0, \ldots, n-1\}}$
can be transformed into $Heap(n - 1)^{-1} \circ Heap(n) \circ Swap(0, n - 1) \circ CaseB(n) =
id_{\{0, \ldots, n-1\}}$. By Lemma 3 this is equivalent to $Heap(n - 1)^{-1} \circ (Gparent(n - 1)\ \ n - 1)
\circ Heap(n) \circ CaseB(n) = id_{\{0, \ldots, n-1\}}$. Lemma 4 completes the proof.
Continuing our example with n = 15 we infer $n - 1 = 14 = (b_3 b_2 b_1 b_0)_2 =
(1\,1\,1\,0)_2$ and $n - 2 = 13 = (c_3 c_2 c_1 c_0)_2 = (1\,1\,0\,1)_2$. Furthermore,
Heap(14)^{-1} = (1 0)^1 (2 0)^1 (3 1)^0 (4 0)^1 (5 2)^1 (6 1)^1 (7 3)^1 (8 0)^1 (9 4)^1 (10 2)^1
(11 5)^1 (12 1)^1 (13 6)^1. Therefore,

(1 0)(2 0)(4 0)(5 2)(6 1)(7 3)(8 0)(9 4)(10 2)(11 5)(12 1)(13 6) 

(14 3)(13 6)(12 1)(11 5)(10 2)(9 4)(8 0)(6 1)(5 2)(4 0)(3 1)(2 0)(1 0) 

(0 14)(0 7)(0 

Inductively, we get the following result 

Theorem 6. The best case of WEAK-HEAPSORT is met given an increasing 
ordering of the input elements. 



5 The Worst Case of WEAK-HEAPSORT 



The worst case analysis is based on the best case analysis. The main idea is that
the special path misses the best case by one element. Therefore, the Merge calls in
MergeForest(m) will contain the index m - 1. This determines the assignment
of the reverse bits on P: If $n - 2 = (b_k \ldots b_0)_2$ and if
$r_{\lfloor (n-2)/2^i \rfloor} = b_{i-1}$ for all
$i \in \{1, \ldots, k\}$ then n - 2 is the last element of P.

An appropriate example fulfilling this property is the input $a_i = i + 1$ with
$i \in \{0, \ldots, n - 2\}$ and $a_{n-1} = 0$. After termination of WeakHeapify we have
$r_0 = 0$, $r_j = 1$ for $j \notin P$, $r_{n-1} = 0$, $r_{n-2} = 1$, and
$r_{\lfloor (n-2)/2^i \rfloor} = b_{i-1}$ for $i \in \{1, \ldots, k\}$.

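The transpositions Heap(n) of the Weak-Heap generation phase are:

$$
\begin{aligned}
& \tau^{1}_{Gparent(n-2),\,n-2} \circ \tau^{1}_{Gparent(n-3),\,n-3} \circ \cdots \circ
 \tau^{1}_{Gparent(\lfloor \frac{n-2}{2} \rfloor + 1),\,\lfloor \frac{n-2}{2} \rfloor + 1} \circ {} \\
& \tau^{b_0}_{Gparent(\lfloor \frac{n-2}{2} \rfloor),\,\lfloor \frac{n-2}{2} \rfloor} \circ
 \tau^{1}_{Gparent(\lfloor \frac{n-2}{2} \rfloor - 1),\,\lfloor \frac{n-2}{2} \rfloor - 1} \circ \cdots \circ {} \\
& \tau^{b_1}_{Gparent(\lfloor \frac{n-2}{4} \rfloor),\,\lfloor \frac{n-2}{4} \rfloor} \circ
 \tau^{1}_{Gparent(\lfloor \frac{n-2}{4} \rfloor - 1),\,\lfloor \frac{n-2}{4} \rfloor - 1} \circ \cdots \circ {} \\
& \tau^{b_{k-2}}_{Gparent(\lfloor \frac{n-2}{2^{k-1}} \rfloor),\,\lfloor \frac{n-2}{2^{k-1}} \rfloor} \circ \cdots \circ
 \tau^{b_{k-1}}_{0,1}
\end{aligned}
$$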



Unless once per level (when the binary tree rooted at position 1 is complete)
we have $\lceil \log(n + 1) \rceil$ instead of $\lceil \log(n + 1) \rceil - 1$ comparisons. If n - 1 is set to
n - 2 Lemma 1 remains valid according to the new input. Therefore, we conclude



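Lemma 6. The first invocation of MergeForest (with the above input) leads to the
following set of transpositions $CaseW(n) = \tau_{0,n-2} \circ \tau^{b_0 \oplus c_0}_{0,\lfloor (n-2)/2 \rfloor} \circ \ldots \circ
\tau^{b_{k-1} \oplus c_{k-1}}_{0,\lfloor (n-2)/2^k \rfloor}$, with $(c_k \ldots c_0)_2$ the binary encoding of n - 3.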

The following two results are obtained by consulting the best case analysis. 



Lemma 7. Heap(n) o (0 n — 1) = (n — 2 n — 1) o Heap(n). 



Lemma 8. CaseW (n) = Swap(0, n — 2) o CaseB{n — 1) . 
Since the definitions of Heap(n) are different in the worst case and best case
analysis we introduce labels $Heap_b(n)$ for the best case and $Heap_w(n)$ for the worst
case, respectively.

Lemma 9.

$Heap_w(n) \circ Swap(0, n - 1) \circ CaseW(n) \circ Heap_w(n - 1)^{-1} = (n - 2\ \ n - 1)$.

Proof. According to Lemma 7 the stated equation is equivalent to

$(n - 2\ \ n - 1) \circ Heap_w(n) \circ CaseW(n) \circ Heap_w(n - 1)^{-1} = (n - 2\ \ n - 1)$.

The observation $Heap_w(n) = Heap_b(n - 1)$ and Lemma 8 result in
$(n - 2\ \ n - 1) \circ Heap_b(n - 1) \circ Swap(0, n - 2) \circ CaseB(n - 1) \circ Heap_b(n - 2)^{-1} = (n - 2\ \ n - 1)$,
which is equivalent to Lemma 5 of the best case.

Inductively, we get 

Theorem 7. The worst case of WEAK-HEAPSORT is met with an input of
the form $a_{n-1} < a_i < a_{i+1}$, $i \in \{0, \ldots, n - 3\}$.

As an example let n = 16, $n - 2 = 14 = (b_3 b_2 b_1 b_0)_2 = (1\,1\,1\,0)_2$
and $n - 3 = 13 = (c_3 c_2 c_1 c_0)_2 = (1\,1\,0\,1)_2$. Further let Heap_w(16) =
(14 3)^1 (13 6)^1 (12 1)^1 (11 5)^1 (10 2)^1 (9 4)^1 (8 0)^1 (7 3)^0 (6 1)^1 (5 2)^1 (4 0)^1
(3 1)^1 (2 0)^1 (1 0)^1, Swap(0, 16) = (0 15)^1, CaseW(16) = (0 14)^1 (0 7)^1 (0 3)^1,
and Heap_w(15)^{-1} = (1 0)^1 (2 0)^1 (3 1)^0 (4 0)^1 (5 2)^1 (6 1)^1 (7 3)^1 (8 0)^1 (9 4)^1
(10 2)^1 (11 5)^1 (12 1)^1 (13 6)^1. Then Heap_w(16) o Swap(0, 16) o CaseW(16) o
Heap_w(15)^{-1} = (14 15).

6 The Average Case of WEAK-HEAPSORT 

Let d(n) be given such that $n \log n + d(n) n$ is the expected number of comparisons
of WEAK-HEAPSORT. Then the following experimental data show that
$d(n) \in [-0.47, -0.42]$. Moreover d(n) is small for $n \approx 2^k$ and big for $n \approx 1.4 \cdot 2^k$.



n       1000    2000    3000    4000    5000    6000    7000    8000
d(n)  -0.462  -0.456  -0.437  -0.456  -0.445  -0.429  -0.436  -0.458

n       9000   10000   11000   12000   13000   14000   15000   16000
d(n)  -0.448  -0.437  -0.432  -0.430  -0.436  -0.443  -0.449  -0.458

n      17000   18000   19000   20000   21000   22000   23000   24000
d(n)  -0.458  -0.449  -0.443  -0.437  -0.433  -0.431  -0.436  -0.427

n      25000   26000   27000   28000   29000   30000
d(n)  -0.431  -0.437  -0.436  -0.440  -0.440  -0.447



There was no significant difference between the execution of one trial and the 
average of 20 trials. The reason is that the variance of the number of comparisons 
in WEAK-HEAPSORT is very small: At n = 30000 and 20 trials we achieved a 
best case 432657 and a worst case of 432816 comparisons. 

According to published results of Wegener (1992) and our own experiments
WEAK-HEAPSORT requires approx. 0.81n fewer comparisons than
BOTTOM-UP-HEAPSORT and approx. 0.45n fewer comparisons than MDR-HEAPSORT.
7 The Weak-Heap Priority-Queue Data Structure

A priority queue provides the following operations on the set of items: Insert to
enqueue an item and DeleteMax to extract the element with the largest key value.
To insert an item v in a Weak-Heap we start with the last index x of the array
a and put v in $a_x$. Then we climb up the grandparent relation until the
Weak-Heap definition is fulfilled. Thus, we have the following pseudo-code: while ($x \ne 0$)
and ($a_{Gparent(x)} < a_x$) Swap(Gparent(x), x); $r_x = 1 - r_x$; $x \leftarrow$ Gparent(x).
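
The insertion loop can be sketched in Python as follows. This is an illustrative reading of the pseudo-code, not the authors' code: the function name `weak_heap_insert`, the use of growable Python lists, and the choice to initialise the reverse bit of a newly appended leaf to 0 are assumptions of the sketch.

```python
def gparent(x, r):
    # Climb while x is a left child (parity of x equals the parent's reverse bit).
    while (x & 1) == r[x // 2]:
        x //= 2
    return x // 2


def weak_heap_insert(a, r, v):
    """Insert key v into the Max-Weak-Heap given by keys a and reverse bits r."""
    a.append(v)
    r.append(0)                      # assumed initial bit for the new leaf
    x = len(a) - 1
    while x != 0:
        g = gparent(x, r)
        if a[g] >= a[x]:             # Weak-Heap condition holds, stop climbing
            break
        a[g], a[x] = a[x], a[g]
        r[x] = 1 - r[x]              # rotate the subtrees of x
        x = g


a, r = [], []
for key in [5, 2, 9, 1, 7, 9, 3, 8]:
    weak_heap_insert(a, r, key)
```

After each call the largest key inserted so far sits at index 0, which is what DeleteMax would extract.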

Since the expected path length of grandparents from a leaf node to a root is
approximately half the depth of the tree, we expect about log n / 4 comparisons
in the average case. The argumentation is as follows. The sum of the lengths of
the grandparent relation from all nodes to the root in a Weak-Heap of size $n = 2^k$
satisfies the following recurrence formula: $S(2^0) = 1$ and $S(2^k) = 2S(2^{k-1}) + 2^{k-1}$
with closed form $nk/2 + n$ such that the average length is about $k/2 + 1$.

A double-ended priority queue, deque for short, extends the priority queue
operations by DeleteMin to extract the smallest key values. The transformation
of a Weak-Heap into its dual in $\lfloor (n - 1)/2 \rfloor$ comparisons is performed by the
following pseudo-code:

for i = {size - 1, ..., $\lfloor (size - 1)/2 \rfloor$ + 1} Swap(Gparent(i), i) followed by

for i = {$\lfloor (size - 1)/2 \rfloor$, ..., 1} Merge(Gparent(i), i)

By successively building the two heaps we have solved the well-known
min-max-problem in the optimal number of $n + \lceil n/2 \rceil - 2$ comparisons.

Each operation in a general priority queue can be divided into several
compare and exchange steps where only the second one changes the structure. We
briefly sketch the implementation. Let M be a Max-Weak-Heap and M' be a
Min-Weak-Heap on a set of n items a and a', respectively. We implicitly define
the bijection $\phi$ by $a_i = a'_{\phi(i)}$. In analogy we might determine $\phi'$ for M'. The
conditions $a_i = a'_{\phi(i)}$ and $a'_i = a_{\phi'(i)}$ are kept as an invariant. Swapping j and
k leads to the following operations: Swap $a_j$ and $a_k$, exchange $\phi(j)$ and $\phi(k)$,
set $\phi'(\phi(j))$ to j and set $\phi'(\phi(k))$ to k. We see that the invariant is preserved.
A similar result is obtained if a swap-operation on M' is considered.

8 Conclusion 

Weak-Heaps are a very fast data structure for sorting in theory and practice. The
worst case number of comparisons for sorting an array of size n is bounded by
$n \log n + 0.1n$ and empirical studies show that the average case is at $n \log n + d(n) n$
with $d(n) \in [-0.47, -0.42]$. Let $k = \lceil \log n \rceil$. The exact worst case bound for
WEAK-HEAPSORT is $nk - 2^k + n - k$ and appears if all but the last two
elements are ordered whereas the exact best case bound of $nk - 2^k + 1$ is found if
all elements are in ascending order. On the other hand the challenging algorithm
BOTTOM-UP-HEAPSORT is bounded by $1.5 n \log n + O(n)$ in the worst case.
Its MDR-HEAPSORT variant consumes at most $n \log n + 1.1n$ comparisons.
Therefore, the sorting algorithm based on Weak-Heaps can be judged to be the
fastest HEAPSORT variant and to compete fairly well with other algorithms.
Abstract. This paper shows that the collection of identities in two variables
which hold in the algebra N of the natural numbers with constant
zero, and binary operations of sum and maximum does not have a finite 
equational axiomatization. This gives an alternative proof of the non- 
existence of a finite basis for N — a result previously obtained by the 
authors. 



1 Introduction 

Since Birkhoff’s original developments, equational logic has been one of the clas- 
sic topics of study within universal algebra. (See, e.g., jSj for a survey of results 
in this area of research.) In particular, the research literature is, among other 
things, rich in results, both of a positive and negative nature, on the existence 
of finite bases for theories (i.e. finite sets of axioms for them). 

In this paper, we contribute to the study of equational theories that are 
not finitely based by continuing our analysis of the equational theory of the 
algebra N of the natural numbers with constant zero, and binary operations of 
summation and maximum (written V in infix form). Our investigations of this 
equational theory started in the companion paper P|. In op. cit. we showed that 
the equational theory of N is not finitely based. Moreover, we proved that, for 
all n > 0, the collection of all the equations in at most n variables that hold in 
N does not form an equational basis. 

The equational theory of N is surprisingly rich in non-trivial families of 
identities. For example, the following infinite schemas of equations also hold in 

N: 

$e_n$ : $n x_1 \vee \ldots \vee n x_n \vee (x_1 + \ldots + x_n) = n x_1 \vee \ldots \vee n x_n$
$e'_n$ : $nx \vee ny = n(x \vee y)$,

where $n \in N$, and nx denotes the n-fold sum of x with itself. By convention, nx
stands for 0 when n = 0, and so does the empty sum.
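
Both schemas are easy to test by brute force on small natural numbers, which is of course no substitute for a proof. The Python sketch below does so; the function names and the search bound `BOUND` are arbitrary choices made for this illustration, and the schema $e_n$ is only tried for n >= 1 to avoid the conventions for empty sums and maxima.

```python
from itertools import product


def e_holds(n, xs):
    """Instance of e_n for the tuple xs = (x_1, ..., x_n) with n >= 1."""
    scaled = [n * x for x in xs]
    return max(scaled + [sum(xs)]) == max(scaled)


def e_prime_holds(n, x, y):
    """Instance of e'_n: nx v ny = n(x v y)."""
    return max(n * x, n * y) == n * max(x, y)


BOUND = 5   # values 0 .. BOUND - 1 are tried for every variable
e_ok = all(e_holds(n, xs)
           for n in range(1, 4)
           for xs in product(range(BOUND), repeat=n))
e_prime_ok = all(e_prime_holds(n, x, y)
                 for n in range(0, 6)
                 for x in range(BOUND)
                 for y in range(BOUND))
```

The check for $e_n$ succeeds because a sum of n numbers never exceeds n times the largest of them.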

Let $Eq_2(N)$ denote the collection of equations that hold in N containing
occurrences of two distinct variables. A natural question suggested by the family
of equations above is the following: Is there a finite set E of equations that
hold in N such that $E \vdash Eq_2(N)$? This paper is devoted to proving that no
finite axiomatization exists for $Eq_2(N)$. Apart from its intrinsic mathematical
interest, this result offers yet another view of the non-existence of a finite basis
for the variety generated by N proven in [5|. 

The proof of our main technical result is model-theoretic in nature, and
follows standard lines. The details are, however, rather challenging. More precisely,
for every prime number p, we construct an algebra $A_p$ in which all the
equations that hold in N and whose "measure of complexity" is strictly smaller than
p hold, but neither $e_p$ nor $e'_p$ hold in $A_p$. As a consequence of this result, we
obtain that not only the equational theory of N is not finitely based, but not
even the collection of equations in two variables included in it is.

Although the proof of our main theorem uses results from 0 , we have striven 
to make the paper self contained. The interested reader is referred to op. eft. and 
the textbook ^ for further background information. Full proofs of our results 
may be found in [Q. 

Related Work. The algebra N, although itself not a semiring, is closely related 
to many instances of these structures whose addition operation (here, the max- 
imum of two natural numbers) is idempotent. Interest in idempotent semirings 
arose in the 1950s through the observation that some problems in discrete op- 
timization could be linearized over such structures (see, e.g., [E! for a survey). 
Since then, the study of idempotent semirings has forged productive connec- 
tions with such diverse fields as, e.g., automata theory, discrete event systems 
and non-expansive mappings. The interested reader is referred to ^ for a survey 
of these more recent developments. Here we limit ourselves to mentioning that 
variations on the algebra N, in the form of the so-called tropical semirings, have 
found deep applications in automata theory and the study of formal power series. 
The tropical semiring $(N \cup \{+\infty\}, \min, +)$ was originally introduced by Simon
in his solution to Brzozowski’s celebrated finite power property problem [n]|. It 
was also used by Hashiguchi in his independent solution to the aforementioned 
problem 0, and in his study of the star height of regular languages (see, e.g., 
(3). Further examples of applications of the tropical semiring may be found in, 
e.g., p|ll|. 

2 The Max-Sum Algebra 

Let $N = (N, \vee, +, 0)$ denote the algebra of the natural numbers equipped with
the usual sum operation +, constant 0 and the operation $\vee$ for the maximum of
two numbers, i.e.,

$x \vee y = \max\{x, y\}$.

We study the equational theory of the algebra N, that is, the collection Eq(N)
of equations that hold in N. The reader will have no trouble in checking that the
following axioms, that express expected properties of the operations of maximum
and sum, hold in N:

$\vee 1$   $x \vee y = y \vee x$                           $+1$   $x + y = y + x$
$\vee 2$   $(x \vee y) \vee z = x \vee (y \vee z)$         $+2$   $(x + y) + z = x + (y + z)$
$\vee 3$   $x \vee 0 = x$                                  $+3$   $x + 0 = x$
$+\vee$    $(x \vee y) + z = (x + z) \vee (y + z)$

This set of equations will be denoted by $Ax_1$. Note that the equation $(x + y) \vee x =
x + y$ is derivable from $+3$ and $+\vee$, and, using such an equation, it is a simple
matter to derive the idempotency law for $\vee$, i.e.,

$\vee 4$   $x \vee x = x$.
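
For readers who want the two derivations spelled out, one possible chain of equalities is sketched below. It is a reconstruction under the axioms listed above, not a quotation from the paper, and it makes explicit that commutativity ($+1$) and the unit law $\vee 3$ are used as well.

```latex
\begin{align*}
(x + y) \vee x &= (y + x) \vee (0 + x) && \text{by } {+1} \text{ and } {+3} \\
               &= (y \vee 0) + x       && \text{by } {+\vee} \text{, read from right to left} \\
               &= y + x                && \text{by } {\vee 3} \\
               &= x + y                && \text{by } {+1} \\[4pt]
x \vee x       &= (x + 0) \vee x       && \text{by } {+3} \\
               &= x + 0                && \text{by the equation just derived, with } y := 0 \\
               &= x                    && \text{by } {+3}
\end{align*}
```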

We denote by $Ax_0$ the set consisting of the equations $\vee 1$, $\vee 2$, $\vee 4$, $+1$ to $+3$ and
$+\vee$. Moreover, we let $V_0$ stand for the class of all models of $Ax_0$, and $V_1$ for
the class of all models of the equations $Ax_1$. Thus, both $V_0$ and $V_1$ are varieties
and, by the above discussion, $V_1$ is a subvariety of $V_0$, i.e., $V_1 \subseteq V_0$.

Since the reduct $(A, \vee)$ of any algebra $A = (A, +, \vee, 0)$ in $V_0$ is a semilattice,
we can define a partial order $\le$ on the set A by $a \le b$ if and only if $a \vee b = b$,
for all $a, b \in A$. This partial order is called the induced partial order. When A
is in the variety $V_1$, the constant 0 is the least element of A with respect to $\le$.
Moreover, for any $A \in V_0$, the $\vee$ and + operations are monotonic with respect
to the induced partial order.

The axiom system $Ax_1$ completely axiomatizes the collection of equations in
at most one variable which hold in the algebra N. However, the interplay between
the operations of maximum and sum generates some non-trivial collections of
equations in two or more variables. For example, the infinite schemas of equations
$e_n$ and $e'_n$ (defined in Sect. 1), which will play an important role in the technical
developments of this paper, also hold in N. It is not too difficult to see that, for
any n, the equation $e_n$ is derivable from $Ax_1$ and $e'_n$.

Let $Eq_2(N)$ denote the collection of equations that hold in N containing
occurrences of two distinct variables. The remainder of this paper is devoted to
proving that no finite axiomatization exists for $Eq_2(N)$.



3 Explicit Description of the Free Algebras 

In this section we give a brief review of some results on the equational theory of
the algebra N that we obtained in [2]. We start by offering an explicit description
of the free algebras in the variety V generated by N. Since N satisfies the
equations in $Ax_1$, we have that V is a subvariety of $V_1$, i.e., $V \subseteq V_1$.

For the sake of clarity, and for future reference, we shall describe the finitely 
generated free algebras in V. We recall that any infinitely generated free algebra 
is a directed union of the finitely generated free ones. 
Let $n \ge 0$ denote a fixed integer. The set $N^n$ is the collection of all
n-dimensional vectors over N. We use $P_f(N^n)$ to denote the collection of all finite
non-empty subsets of $N^n$, and define the operations in the following way: for all
$U, V \in P_f(N^n)$,

$U \vee V := U \cup V$

$U + V := \{u + v : u \in U, v \in V\}$

$0 := \{\mathbf{0}\}$,

where $\mathbf{0}$ stands for the vector whose components are all 0. For each
$i \in [n] = \{1, \ldots, n\}$, let $u_i$ denote the ith unit vector in $N^n$, i.e., the vector whose only
non-zero component is a 1 in the ith position.
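
The operations on $P_f(N^n)$ are concrete enough to experiment with. The Python sketch below models finite non-empty sets of vectors as frozensets of tuples; the function names `join`, `plus`, `zero` and `unit` are illustrative assumptions. It evaluates a few instances over the generators for n = 2, including both sides of $e_2$.

```python
def join(U, V):
    """U v V := U union V."""
    return U | V


def plus(U, V):
    """U + V := { u + v : u in U, v in V } with componentwise vector addition."""
    return frozenset(tuple(a + b for a, b in zip(u, v)) for u in U for v in V)


def zero(n):
    """The constant 0: the singleton containing the all-zero vector."""
    return frozenset({(0,) * n})


def unit(i, n):
    """The singleton {u_i} for the i-th unit vector (i counted from 1)."""
    return frozenset({tuple(1 if j == i else 0 for j in range(1, n + 1))})


n = 2
X, Y, Z = unit(1, n), unit(2, n), zero(n)

distributes = plus(join(X, Y), X) == join(plus(X, X), plus(Y, X))   # instance of +v
plus_zero = plus(X, Z) == X                                         # instance of +3
join_zero_fails = join(X, Z) != X                                   # v3 is not satisfied

# Both sides of e_2 evaluated at the generators: 2x v 2y v (x + y) and 2x v 2y.
e2_lhs = join(join(plus(X, X), plus(Y, Y)), plus(X, Y))
e2_rhs = join(plus(X, X), plus(Y, Y))
```

In this model the left-hand side of $e_2$ contains the extra vector (1, 1), so the two sides differ as sets; this is in line with the later remark that further identifications are needed before $e_n$ holds.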

Proposition 1. The algebra $P_f(N^n)$ is freely generated in $V_0$ by the n singleton
sets $\{u_i\}$, $i \in [n]$, containing the unit vectors.

Note that the induced partial order on $P_f(N^n)$ is given by set inclusion.

It is easy to see that any term t in the variables $x_1, \ldots, x_n$ ($n \ge 0$) can be
rewritten, using the equations in $Ax_0$, to the maximum of linear combinations of
the variables $x_1, \ldots, x_n$, i.e., there are $m \ge 1$, and $c^i_j \in N$ for $j \in [n]$ and $i \in [m]$
such that the equation

$t = \bigvee_{i \in [m]} \Bigl( \sum_{j \in [n]} c^i_j x_j \Bigr)$

holds in $V_0$. We refer to such terms as normal forms. Thus we may assume that
any equation which holds in a given subvariety of $V_0$ is in normal form, i.e., of
the form $t_1 = t_2$ where $t_1$ and $t_2$ are normal forms. Furthermore, an equation
$t_1 \vee \ldots \vee t_m = t'_1 \vee \ldots \vee t'_{m'}$ holds in a subvariety of $V_0$ if and only if, for all
$i \in [m]$ and $j \in [m']$,

$t_i \le t'_1 \vee \ldots \vee t'_{m'}$ and $t'_j \le t_1 \vee \ldots \vee t_m$

hold in the subvariety. We refer to an inequation of the form $t \le t_1 \vee \ldots \vee t_m$,
where $t, t_1, \ldots, t_m$ are linear combinations of variables, as simple inequations.

A simple inequation $t \le t_1 \vee \ldots \vee t_m$ that holds in N is irredundant if, for
every $j \in [m]$,

$N \not\models t \le t_1 \vee \ldots \vee t_{j-1} \vee t_{j+1} \vee \ldots \vee t_m$.

By the discussion above, we may assume, without loss of generality, that every 
set of inequations that hold in N consists of simple, irredundant inequations 
only. 

In order to give an explicit description of the finitely generated free algebras
in $V_1$, we need to take into account the effect of equation $\vee 3$. Let $\le$ denote the
pointwise partial order on $N^n$. As usual, we say that a set $U \subseteq N^n$ is an order
ideal, if $u \le v$ and $v \in U$ jointly imply that $u \in U$, for all vectors $u, v \in N^n$. Each
set $U \subseteq N^n$ is contained in a least ideal $(U]_n$, the ideal generated by U. The
relation that identifies two sets $U, V \in P_f(N^n)$ if $(U]_n = (V]_n$ is a congruence
relation on $P_f(N^n)$, and the quotient with respect to this congruence is easily
seen to be isomorphic to the subalgebra $I_f(N^n)$ of $P_f(N^n)$ generated by the finite
non-empty ideals.

For each $i \in [n]$, let $(u_i]_n$ denote the principal ideal generated by the unit
vector $u_i$, i.e., the ideal $(\{u_i\}]_n$.

Proposition 2. $I_f(N^n)$ is freely generated in $V_1$ by the n principal ideals $(u_i]_n$.

Again, the induced partial order on $I_f(N^n)$ is the partial order determined by
set inclusion.

We note that, if $n \ge 2$, then the equation $e_n$ fails in $I_f(N^n)$, and a fortiori
in $V_1$. Since for $n \ge 2$ the equation $e_n$ holds in N but fails in $V_1$, in order to
obtain a concrete description of the free algebras in V we need to make further
identifications of the ideals in $I_f(N^n)$. Technically, we shall start with $P_f(N^n)$.

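Let $v_1, \ldots, v_k$ ($k \ge 1$) be vectors in $N^n$, and suppose that $\lambda_i$ ($i \in [k]$) are
non-negative real numbers with $\sum_{i \in [k]} \lambda_i = 1$. We call the vector of real numbers
$\sum_{i \in [k]} \lambda_i v_i$ a convex linear combination of the vectors $v_i$ ($i \in [k]$).

Definition 1. We call a set $U \subseteq N^n$ a convex ideal if for any convex linear
combination $\sum_{i \in [k]} \lambda_i v_i$ with $v_i \in U$ for all $i \in [k]$, and for any $v \in N^n$, if

$v \le \sum_{i \in [k]} \lambda_i v_i$

in the pointwise order, then $v \in U$.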

Note that any convex ideal is an ideal. Moreover, the intersection of any number
of convex ideals is a convex ideal. Thus, any subset $U$ of $\mathbb{N}^n$ is contained in a
least convex ideal, $[U]_n$. When $U$ is finite, so is $[U]_n$. We let $c \le U$ mean that
the simple inequation

\[ c \cdot x \le \bigvee_{d \in U} d \cdot x \]

(with $x = (x_1, \dots, x_n)$ a vector of variables) holds in $\mathbf{N}$. Then we have the
following useful characterization of the simple inequations that hold in $\mathbf{N}$.

Lemma 1. Suppose that $U \in P_f(\mathbb{N}^n)$ and $c \in \mathbb{N}^n$. Then $c \in [U]_n$ iff $c \le U$.

As a corollary of Lemma 1 we obtain the following alternative characterization
of the simple inequations which hold in the algebra $\mathbf{N}$.

Corollary 1. Let $c, d_j$ ($j \in [m]$) be vectors in $\mathbb{N}^n$. Then $c \le \{d_1, \dots, d_m\}$ iff
there are $\lambda_1, \dots, \lambda_m \ge 0$ such that $\lambda_1 + \dots + \lambda_m = 1$ and
$c \le \lambda_1 d_1 + \dots + \lambda_m d_m$ with respect to the pointwise ordering. Moreover, if
$c \le \{d_1, \dots, d_m\}$ is irredundant, then $\lambda_1, \dots, \lambda_m > 0$.

The above result offers a geometric characterization of the simple inequations in
$Eq(\mathbf{N})$, viz. an inequation $c \cdot x \le d_1 \cdot x \vee \dots \vee d_m \cdot x$ (where $x = (x_1, \dots, x_n)$ is a
vector of variables) holds in $\mathbf{N}$ iff the vector $c$ lies in the ideal generated by the
convex hull of the vectors $d_1, \dots, d_m$.
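The geometric criterion can be tried out on small cases. The sketch below is only an illustration under stated assumptions: it handles the special case of two vectors $d_1, d_2$ (where the admissible $\lambda$ form an interval that can be computed exactly), and the grid check is a finite sanity test, not a proof that an inequation holds in $\mathbf{N}$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def below_convex_combination(c, d1, d2):
    """True iff c <= lam*d1 + (1-lam)*d2 pointwise for some lam in [0, 1]."""
    lo, hi = Fraction(0), Fraction(1)
    for ci, x, y in zip(c, d1, d2):
        # constraint for this coordinate: ci <= y + lam * (x - y)
        if x == y:
            if ci > y:
                return False
        elif x > y:
            lo = max(lo, Fraction(ci - y, x - y))
        else:
            hi = min(hi, Fraction(ci - y, x - y))
    return lo <= hi

def holds_on_grid(c, ds, bound=8):
    """Check c.x <= max_j d_j.x for all x in {0..bound-1}^n (necessary condition only)."""
    dot = lambda u, x: sum(a * b for a, b in zip(u, x))
    return all(dot(c, x) <= max(dot(d, x) for d in ds)
               for x in product(range(bound), repeat=len(c)))
```

With $c = (1,1)$, $d_1 = (2,0)$ and $d_2 = (0,2)$ both checks succeed (take $\lambda = 1/2$), matching the fact that $x_1 + x_2 \le 2x_1 \vee 2x_2$ holds in $\mathbf{N}$; with $c = (2,1)$ both fail.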

Let $\sim$ denote the congruence relation on $P_f(\mathbb{N}^n)$ that identifies two sets of
vectors iff they generate the same least convex ideal. It is immediate to see
that the quotient algebra $P_f(\mathbb{N}^n)/{\sim}$ is isomorphic to the following algebra
$CI_f(\mathbb{N}^n) = (CI_f(\mathbb{N}^n), \vee, +, 0)$ of all non-empty finite convex ideals in $\mathbb{N}^n$.
For any two $I, J \in CI_f(\mathbb{N}^n)$,

\[ I \vee J := [I \cup J]_n \]

\[ I + J := [\{u + v : u \in I,\ v \in J\}]_n \]

\[ 0 := \{0\} . \]

Indeed, an isomorphism $P_f(\mathbb{N}^n)/{\sim} \to CI_f(\mathbb{N}^n)$ is given by the mapping
$U/{\sim} \mapsto [U]_n$.

Recall that, for each $i \in [n]$, $u_i$ denotes the $i$th unit vector in $\mathbb{N}^n$. For each
$i \in [n]$, the set $[u_i]_n = (u_i]_n = \{u_i, 0\}$ is the least convex ideal containing $u_i$.

Theorem 1. $CI_f(\mathbb{N}^n)$ is freely generated by the $n$ convex ideals $[u_i]_n$ in the
variety $\mathcal{V}$.



4 The Two-Variable Fragment is not Finitely Based

We now proceed to apply the results that we have recalled in the previous section
to the study of the two-variable fragment of the equational theory of the algebra
$\mathbf{N}$. The main aim of this paper is to prove the following result to the effect that
the collection of equations $Eq_2(\mathbf{N})$ cannot be deduced using any finite number
of equations in $Eq(\mathbf{N})$.

Theorem 2. There is no finite set $E$ of equations in $Eq(\mathbf{N})$ such that
$E \vdash Eq_2(\mathbf{N})$.

To prove Thm. 2 we shall define a sequence of algebras $\mathbf{A}_n$ ($n \ge 1$) in $\mathcal{V}_1$ such
that the following holds: For any finite set $E$ of equations which hold in $\mathbf{N}$, there is
an $n$ such that

\[ \mathbf{A}_n \models E \quad\text{but}\quad \mathbf{A}_n \not\models e'_n . \]

Recalling that, for any $n$, the equation $e_n$ is derivable from $e'_n$ and $Ax_1$, it is
sufficient to prove the statement above with $e'_n$ replaced by $e_n$. Furthermore,
in light of our previous analysis, the result we are aiming at in this section,
viz. Thm. 2, may now be reformulated as follows.

Proposition 3. Let $E$ be a finite set of simple, irredundant inequations such that
$\mathbf{N} \models E$. Then there is an $n \in \mathbb{N}$ and an algebra $\mathbf{A}_n \in \mathcal{V}_1$ such that $\mathbf{A}_n \models E$ but
$\mathbf{A}_n \not\models e_n$.

The non-existence of a finite axiomatization of the two-variable fragment of the
equational theory of $\mathbf{N}$ follows easily from Propn. 3 and the preceding discussion.
In fact, let $E$ be any finite subset of $Eq(\mathbf{N})$. Without loss of generality, we may
assume that $E$ includes $Ax_1$, and that $E \setminus Ax_1$ consists of simple, irredundant
inequations. Then, by Propn. 3, there is an algebra $\mathbf{A}_n \in \mathcal{V}_1$ such that $\mathbf{A}_n \models E$
but $\mathbf{A}_n \not\models e_n$. Since $e_n$ is derivable from $Ax_1$ and $e'_n$, it follows that $\mathbf{A}_n \not\models e'_n$.
We may therefore conclude that $E \nvdash Eq_2(\mathbf{N})$, which was to be shown.



4.1 The Algebras $\mathbf{A}_n$

We let the weight of a vector $u \in \mathbb{N}^n$, notation $|u|$, be defined as the sum of
its components. (Equivalently $|u| = u \cdot \delta_n$, where $\delta_n = (1, \dots, 1)$.) To define the
algebra $\mathbf{A}_n$, where $n \ge 1$ is a fixed integer, let us call a set $I \subseteq \mathbb{N}^n$ an $n$-convex
ideal if it is an ideal and for any convex linear combination $v = \lambda_1 v_1 + \dots + \lambda_m v_m$
and vector $u \in \mathbb{N}^n$ of weight $|u| < n$, if $v_1, \dots, v_m \in I$ and $u \le v$, where $\le$ is
the pointwise order, then $u \in I$. It is clear that any convex ideal in $\mathbb{N}^n$ is
$n$-convex. Any set $U \subseteq \mathbb{N}^n$ is contained in a least $n$-convex ideal, denoted $\lfloor U]_n$.
(The subscript $n$ will often be omitted when it is clear from the context.) Call a
vector $v = (v_1, \dots, v_n) \in \mathbb{N}^n$ $n$-ok, written $ok_n(v)$, if $|v| \le n$ and

\[ |v| = n \;\Rightarrow\; \exists i \in [n].\ v_i = n . \]

A set of vectors is $n$-ok if all of its elements are. Note that if $U$ is a finite
non-empty set consisting of $n$-ok vectors, then $\lfloor U]$ is also finite and contains only
$n$-ok vectors.
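The $n$-ok condition is easy to state operationally. The following sketch is illustrative only (the names `weight`, `ok` and `set_sum` are not from the paper); it also shows that the element-wise sum of two $n$-ok sets need not be $n$-ok, which is the situation the extra element $\top$ of $\mathbf{A}_n$ is meant to absorb.

```python
def weight(v):
    """Weight |v| of a vector: the sum of its components."""
    return sum(v)

def ok(v, n):
    """ok_n(v): weight at most n, and weight exactly n only for n times a unit vector."""
    w = weight(v)
    return w < n or (w == n and n in v)

def set_sum(I, J):
    """Element-wise sums {u + v : u in I, v in J} of two sets of vectors."""
    return {tuple(a + b for a, b in zip(u, v)) for u in I for v in J}
```

For $n = 3$, the vectors $(2,0)$ and $(0,1)$ are both 3-ok, but their sum $(2,1)$ has weight 3 without being three times a unit vector, so it is not 3-ok.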

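The algebra $\mathbf{A}_n$ consists of all non-empty (finite) $n$-convex ideals of $n$-ok
vectors, as well as the element $\top$. The operations are defined as follows: for all
$I, J \in \mathbf{A}_n$ with $I, J \ne \top$, let $K = \{u + v : u \in I,\ v \in J\}$. Then,

\[ I + J := \begin{cases} \lfloor K] & \text{if } K \text{ contains only } n\text{-ok vectors} \\ \top & \text{otherwise,} \end{cases} \]

\[ I \vee J := \lfloor I \cup J] \]

\[ 0 := \{0\} = \lfloor 0] . \]

Moreover, we define $\top + I = \top \vee I = \top$, and symmetrically, for all $I \in \mathbf{A}_n$.

Proposition 4. For each $n \ge 1$, $\mathbf{A}_n \in \mathcal{V}_1$.

We shall now show that, for every $n \ge 2$, the algebra $\mathbf{A}_n$ is not in $\mathcal{V}$. In
particular, if $n \ge 2$, then the equations $e_n$ and $e'_n$ do not hold in $\mathbf{A}_n$.

Lemma 2. If $n \ge 2$ then $\mathbf{A}_n \not\models e_n$ and $\mathbf{A}_n \not\models e'_n$.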

Note that the induced partial order on $\mathbf{A}_n$ has $\top$ as its top element, and coincides
with the inclusion order over the elements in $\mathbf{A}_n$ that are different from $\top$.
For $K \in P_f(\mathbb{N}^n)$, by slightly abusing notation, we let $\lfloor K]$ stand for $\lfloor K]$ in
the original sense if $K$ contains only $n$-ok vectors and $\top$ otherwise. With this
extension of the definition of $\lfloor\,\cdot\,]$ we have that $\lfloor K] + \lfloor L] = \lfloor K + L]$ for all
$K, L \in P_f(\mathbb{N}^n)$. Using this notation, denoting the induced preorder on $P_f(\mathbb{N}^n)$ by
$\preceq_n$, we obtain the following alternative characterization of $\mathbf{A}_n$ as a quotient
algebra of $P_f(\mathbb{N}^n)$, which will be used in the technical developments to follow.

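Let $L, M \in P_f(\mathbb{N}^n)$. We write

\[ L \le M \iff \forall u \in L.\ u \le M . \]

Now,

\[ \mathbf{A}_n = \{ \lfloor L] : L \in P_f(\mathbb{N}^n) \} \]

and

\[ \lfloor L] \le \lfloor M] \text{ iff } L \preceq_n M , \]

where

\[ L \preceq_n M \iff [\, ok_n(M) \Rightarrow (ok_n(L) \wedge L \le M) \,] . \]

Similarly Lemma 1 provides us with the following characterization of $CI_f(\mathbb{N}^n)$:

\[ CI_f(\mathbb{N}^n) = \{ [L] : L \in P_f(\mathbb{N}^n) \} \]

where, by Lemma 1,

\[ [L] \subseteq [M] \text{ iff } L \le M . \]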



In what follows we let the simple inequation

\[ a^1 x_1 + \dots + a^m x_m \le (c_1^1 x_1 + \dots + c_1^m x_m) \vee \dots \vee (c_k^1 x_1 + \dots + c_k^m x_m) \]

be represented by $(a^i)^{i \le m} \le (c_j^i)^{i \le m}_{j \le k}$ (or sometimes simply by $a \le C$ if the
meaning is clear from the context), where $(a^i)^{i \le m}$ denotes the row vector
$(a^1, \dots, a^m)$ and $(c_j^i)^{i \le m}_{j \le k}$ the $k \times m$ matrix

\[ \begin{bmatrix} c_1^1 & \dots & c_1^m \\ \vdots & & \vdots \\ c_k^1 & \dots & c_k^m \end{bmatrix} . \]

We also let an instantiation of the variables $x_1, \dots, x_m$ by the singleton sets
$\{\eta_1\}, \dots, \{\eta_m\}$ (or equivalence classes generated by these sets) be represented as

\[ \eta = \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_m \end{bmatrix} = \begin{bmatrix} \eta_1^1 & \dots & \eta_1^n \\ \vdots & & \vdots \\ \eta_m^1 & \dots & \eta_m^n \end{bmatrix} , \]

i.e. the matrix with row vectors $\eta_1, \dots, \eta_m$. We note that, by commutativity of
$\vee$, the simple inequation $a \le B$, where $B$ is any matrix obtained by permuting
the rows of $C$, represents exactly the same simple inequation as $a \le C$, viz. the
inequation $a \le U$, where $U = \{c_1, \dots, c_k\}$ and the $c_i$ ($i \in [k]$) are the row
vectors of $C$. Similarly any permutation of the column vectors of $C$ combined
with a corresponding permutation of the entries of $a$ yields a simple inequation
that holds in $\mathbf{N}$ iff $a \le C$ does. (Any instantiation matrix should be similarly
permuted as well.) The weight of $C$, notation $|C|$, is defined as the sum of its
entries.



The following result will play a key role in the proof of Thm. 3 to follow.

Corollary 2. Assume that $a \le C$ holds in $\mathbf{N}$ and is irredundant. Then
$\mathbf{A}_p \not\models a \le C$, where $p$ is a prime number with $|a| < p$ and $|C| < p$, iff there is an
instantiation matrix $\eta \in \mathbb{N}^{m \times p}$ such that

1. $a \cdot \eta$ has weight $p$ and at least two non-zero coefficients, and

2. $c_j \cdot \eta$ has weight $p$ and exactly one non-zero coefficient for $j \in [k]$.

Note that if $\eta$ is an instantiation matrix satisfying points 1-2 in the above
statement, then so does any matrix obtained by permuting the columns of $\eta$.
This observation will play an important role later in the proof of Thm. 3.
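Conditions 1-2 are purely arithmetic and can be checked mechanically for a candidate instantiation matrix. The sketch below is an illustration with hypothetical names (`row_times_matrix`, `is_witness`); it only tests the two conditions and does not verify the hypotheses of Corollary 2 (that $a \le C$ holds in $\mathbf{N}$, is irredundant, and that $p$ is a prime exceeding $|a|$ and $|C|$).

```python
def row_times_matrix(row, eta):
    """Product row * eta, where eta is a list of rows (one row per variable)."""
    cols = len(eta[0])
    return tuple(sum(r * eta[i][j] for i, r in enumerate(row)) for j in range(cols))

def nonzero_count(v):
    """Number of non-zero coefficients of a vector."""
    return sum(1 for x in v if x != 0)

def is_witness(a, C, eta, p):
    """Check conditions 1 and 2 for the instantiation matrix eta."""
    a_eta = row_times_matrix(a, eta)
    if sum(a_eta) != p or nonzero_count(a_eta) < 2:
        return False  # condition 1 fails
    for c in C:
        c_eta = row_times_matrix(c, eta)
        if sum(c_eta) != p or nonzero_count(c_eta) != 1:
            return False  # condition 2 fails for this row of C
    return True
```

As toy data, take $a = (1,1)$, $C$ with rows $(2,0)$ and $(0,2)$, and $p = 2$: the identity matrix satisfies both conditions, whereas the matrix with rows $(2,0)$ and $(0,0)$ violates condition 1. This toy case lies outside the range $p > |C|$ covered by the corollary and is meant only to exercise the checker.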

Using the previous result, we are now in a position to show the following
theorem, which is the key to the proof of Propn. 3 and of Thm. 2.

Theorem 3. If $a \le C$ is irredundant and holds in $\mathbf{N}$, then $\mathbf{A}_p \not\models a \le C$ for at
most one prime number $p > \max\{|a|, |C|\}$.

Proof. Assume that $a = (a^1, \dots, a^m) \in \mathbb{N}^m$ and $C = (c_j^i)^{i \le m}_{j \le k} \in \mathbb{N}^{k \times m}$ (i.e.
the inequation $a \le C$ contains at most $m$ variables) and that $\mathbf{N} \models a \le C$.
Assume furthermore that the statement of the theorem does not hold and that
$m$ is the smallest number of variables for which it fails; we shall show that this
leads to a contradiction. So suppose that $\mathbf{N} \models a \le C$ but that $\mathbf{A}_p \not\models a \le C$
and $\mathbf{A}_q \not\models a \le C$, where $p$ and $q$ are prime and $\max\{|a|, |C|\} < p < q$. First we
note that we may assume that $a^i > 0$ for $i \in [m]$, or else we could immediately
reduce the number of variables in the equation under consideration. Since $a \le C$
holds in $\mathbf{N}$, this implies that each column vector of $C$ is non-zero. We now
continue with the proof as follows: First we use the assumption $\mathbf{A}_p \not\models a \le C$
and Corollary 2 to analyze the structure of $a$ and $C$. Then we argue that the
result of this analysis contradicts our second assumption, viz. the failure of the
inequation $a \le C$ in $\mathbf{A}_q$.

As $\mathbf{A}_p \not\models a \le C$, by Corollary 2 there is an instantiation matrix
$\eta = (\eta_i^j) \in \mathbb{N}^{m \times p}$ such that

- $a \cdot \eta$ has weight $p$ and at least two non-zero coefficients, and

- $c_j \cdot \eta$ has weight $p$ and exactly one non-zero coefficient, for every $j \in [k]$.

By minimality of $m$, we may furthermore assume that each row of $\eta$ contains a
non-zero entry.

By rearranging the order of the rows of $C$ (i.e. the order of the terms that
occur on the right-hand side of the simple inequation) and of the columns of $\eta$, we
may assume that there is a $k_0$, $0 < k_0 \le k$, such that

1. $c_j \cdot \eta = (p, 0, \dots, 0)$ for $j \in [k_0]$ and

2. $c_j \cdot \eta = (0, l_j^2, \dots, l_j^p)$ for $k_0 < j \le k$, where exactly one of the $l_j^i$ is non-zero
for $2 \le i \le p$.






By rearranging the columns of $C$ and correspondingly the order of the entries
of $a$ and $\eta$ (i.e. the order of the variables that occur in the equation), we may
assume that there is an $m_0 \in [m]$ such that if $j \le k_0$ then $c_j^i = 0$ for all $i > m_0$.

Therefore we may suppose that $C$ looks as follows:

\[ C = \left[ \begin{array}{ccc|c} c_1^1 & \dots & c_1^{m_0} & \\ \vdots & & \vdots & O_1 \\ c_{k_0}^1 & \dots & c_{k_0}^{m_0} & \\ \hline & O_2 & & ? \end{array} \right] \]

where $O_1$ and $O_2$ are (possibly empty) 0-matrices (i.e., matrices whose entries
are all 0) and ? just means that this part of the matrix is unknown. Recall that,
as previously observed, some of the $c_j^i$ ($i \in [m_0]$, $j \in [k_0]$) may be 0, but no
column in the upper left corner of $C$ is identically 0.

We claim that $k_0 < k$ and $m_0 < m$, and therefore that $O_1$ and $O_2$ are both
non-trivial. To see that this claim holds, note that, by 1.-2. above, $\eta$ has the
form

\[ \eta = \left[ \begin{array}{c|c} \eta_1^1 & \\ \vdots & O \\ \eta_{m_0}^1 & \\ \hline \multicolumn{2}{c}{?} \end{array} \right] \]

where $O$ is a 0-matrix. This follows because if one of the columns of the matrix
to the right of the first column (the one that starts with $\eta_1^1, \dots, \eta_{m_0}^1$) has a non-zero
entry above the horizontal solid line, at least one of the $c_j \cdot \eta$, $j \in [k_0]$, is going
to have more than one non-zero entry. Since the rows of $\eta$ are non-zero, we have
that $\eta_j^1 \ne 0$ for every $j \in [m_0]$. Using this fact it is not difficult to see that,
as claimed, the cases where either $k_0 = k$ or $m_0 = m$ cannot occur. Indeed, if
$m_0 = m$ then $\eta$ has only one non-zero column and consequently $a \cdot \eta$ cannot
have two non-zero coefficients. Moreover, if $k_0 = k$ and $m_0 < m$ then the right
hand side of the inequation does not contain the variables $x_i$, with $m_0 < i \le m$,
whereas, by assumption, the left hand side does, a contradiction to the fact that
the inequation $a \le C$ holds in $\mathbf{N}$. We may therefore assume that $k_0 < k$ and
$m_0 < m$.

Now we recall from Cor. 1 that, since $a \le C$ is an irredundant inequation
that holds in $\mathbf{N}$, there is a sequence of real numbers $\lambda_i$, $i \in [k]$, such that

\[ a \le (\lambda_1 c_1 + \dots + \lambda_{k_0} c_{k_0}) + (\lambda_{k_0+1} c_{k_0+1} + \dots + \lambda_k c_k) , \]

where $\lambda_i > 0$ for $i \in [k]$ and $\lambda_1 + \dots + \lambda_k = 1$. Next let

\[ a_1 = (a^1, \dots, a^{m_0}, 0, \dots, 0) \quad\text{and}\quad a_2 = (0, \dots, 0, a^{m_0+1}, \dots, a^m) . \]

Then

\[ a = a_1 + a_2 , \]
\[ a_1 \le \lambda_1 c_1 + \dots + \lambda_{k_0} c_{k_0} \quad\text{and} \]
\[ a_2 \le \lambda_{k_0+1} c_{k_0+1} + \dots + \lambda_k c_k . \]

Recalling that $|c| = c \cdot \delta_p$, for every vector $c \in \mathbb{N}^p$, this implies that

\[ p = |a \cdot \eta| = |a_1 \cdot \eta| + |a_2 \cdot \eta| , \]
\[ |a_1 \cdot \eta| \le \lambda_1 |c_1 \cdot \eta| + \dots + \lambda_{k_0} |c_{k_0} \cdot \eta| = (\lambda_1 + \dots + \lambda_{k_0})\, p \quad\text{and} \]
\[ |a_2 \cdot \eta| \le \lambda_{k_0+1} |c_{k_0+1} \cdot \eta| + \dots + \lambda_k |c_k \cdot \eta| = (\lambda_{k_0+1} + \dots + \lambda_k)\, p . \]

In particular, as $\lambda_1 + \dots + \lambda_k = 1$, the inequalities above must be equalities.
Thus we have proven that

\[ n_p = |a_1 \cdot \eta| = \lambda p , \]

where $\lambda = \lambda_1 + \dots + \lambda_{k_0}$ and $n_p \in \mathbb{N}$. Note that, as $k_0 < k$ and $\lambda_i > 0$ ($i \in [k]$),
it holds that $\lambda < 1$. Hence we have that $n_p < p$.

In a similar way, using the form of $C$ displayed above and the fact that $q > p$,
$\mathbf{N} \models a \le C$ and $\mathbf{A}_q \not\models a \le C$, we may conclude that there is a $\gamma \in \mathbb{N}^{m \times q}$ such that

\[ n_q = |a_1 \cdot \gamma| = \lambda q . \]

This in turn implies that $n_p / p = n_q / q$, or equivalently $n_p \cdot q = n_q \cdot p$, contradicting
our assumption that $n_p < p$, $n_q < q$ and $p$ and $q$ are different primes. We
may therefore conclude that no such minimal number of variables $m$ exists and
consequently that the statement of the theorem holds. This completes the proof
of the theorem. □
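The arithmetic fact behind the final contradiction can be sanity-checked by brute force. The sketch below assumes, in addition to what is stated above, that $n_p$ and $n_q$ are positive (which corresponds to $\lambda > 0$); under that assumption $n_p \cdot q = n_q \cdot p$ would force $p$ to divide $n_p$, which is impossible for $0 < n_p < p$. The function names are illustrative.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def common_ratio_solutions(p, q):
    """All (n_p, n_q) with 0 < n_p < p, 0 < n_q < q and n_p * q == n_q * p."""
    return [(n_p, n_q)
            for n_p in range(1, p)
            for n_q in range(1, q)
            if n_p * q == n_q * p]

small_primes = [n for n in range(2, 60) if is_prime(n)]
no_solution = all(not common_ratio_solutions(p, q)
                  for p in small_primes for q in small_primes if p < q)
```

For non-prime moduli solutions do exist, e.g. $(2, 3)$ for $p = 4$, $q = 6$, which is why the primality of $p$ and $q$ matters.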

Propn. 3 follows immediately from the above result, completing the proof of the
non-existence of a finite equational axiomatization for the two-variable fragment
of the equational theory of the algebra $\mathbf{N}$.

Remark 1. Using our results, it is easy to show that the reduct $(\mathbb{N}, \vee, +)$ of $\mathbf{N}$
is also not finitely based, and that the two-variable fragment of its equational
theory has no finite equational axiomatization.

As a further corollary of Thm. 2 we obtain that the equational theory of the
algebra $(\mathbb{N}, \vee, +, 0, 1)$ is also not finitely based. To see this, note that whenever an
equation holds in $(\mathbb{N}, \vee, +, 0, 1)$ and one side contains an occurrence of the symbol
1, then so does the other side. Let $E$ be an axiom system for $(\mathbb{N}, \vee, +, 0, 1)$, and
let $E_0$ denote the subset of $E$ consisting of all the equations not containing
occurrences of the constant 1. In light of the above observation, $E_0$ is an axiom
system for the reduct $\mathbf{N}$ of $(\mathbb{N}, \vee, +, 0, 1)$. Thus the existence of a finite basis for
the algebra $(\mathbb{N}, \vee, +, 0, 1)$ would contradict Thm. 2. In similar fashion, it is easy
to prove that the two-variable fragment of the equational theory of the algebra
$(\mathbb{N}, \vee, +, 0, 1)$ has no finite equational axiomatization.



Remark 2. In we also provide an application of our main result to process 
algebra. More precisely, we show that trace equivalence has no finite $\omega$-complete
equational axiomatization for the language of Basic Process Algebra [3] over a
singleton alphabet, with or without the empty process. 

Acknowledgments: The anonymous referees provided useful comments and
pointers to the large body of literature on the max-sum algebra and its applications.

Abstract. A commutative complemented Kleene algebra of sets of (positive)
real numbers is introduced. For the subalgebra generated by finite
unions of rational intervals a normal form is found. This normal form is then
applied to the complementation problem for real-time automata.



1 Introduction 

Computing with intervals has sometimes been a by-product whenever a
quantitative study of time and succession was of interest. Examples are constraint
networks [DMP91], timed automata [AD94] and logics of real-time [AH96]. It is
trivial that finite unions of intervals form a boolean algebra and in USHHI it has
been noticed that “periodic” unions of intervals form a boolean algebra, but the
next and simple step of looking at a Kleene algebra of sets of positive reals has
not yet (to our knowledge) been an issue of interest.

We make this step here by defining star with the usual least fixpoint
construction based on addition of sets of numbers, hence transforming sets of real
numbers into a Kleene algebra. This algebra naturally arises in the study of the
so-called real-time automata (RTA) [Dim99], which are timed automata with a
single clock which is reset at each transition. In this model removing silent
steps (a necessary step for transforming nondeterministic automata into
deterministic ones and then “complementing” them) involves computing stars of
unions of intervals. Hence even at this lowest level of generalization from the
untimed finite automata removing silent transitions [BDGP98] is a problem.

We solve this problem here by studying the sub-Kleene algebra generated by
finite unions of intervals with rational bounds. We prove a normal form theorem
for the elements of this subalgebra, a result which is based on properties of integer
division. We then find a normal form for circuits with silent transitions which
allows complementation. Actually this is done by allowing clock constraints of
the form $x \in X$ where $X$ is no longer a finite union of intervals but some
“periodic” union of intervals.

We note that, though RTA cannot be used for modeling distributed real-time
systems, they show sufficient theoretical interest as they are incomparable
to event-clock automata [AFH94,HRS98], which are so far the largest determinizable
subclass of timed automata. This situation occurs as RTA may accept
languages consisting of signals whose lengths are natural numbers while event-clock
automata may not.

* This research was done during the author’s visit to TIFR, Bombay, supported by an
extension of a UNU/IIST fellowship.

The rest of the paper is organized as follows: in the next section we recall some
definitions and state the problem of complementation of real-time automata. In
the third section the Kleene algebra of sets of positive reals is introduced and
the normal form theorem is proved. The fourth section is for the determinization
construction and the last section is for short comments and directions of further
study.

2 Real-Time Automata 

A signal over a finite alphabet $V$ is a function $\sigma : [0, e) \to V$, where $e$ is a
nonnegative number, which has finitely many discontinuities, all of them
being left discontinuities. Hence the domain of a signal $\sigma$ splits into finitely many
intervals $[e_{i-1}, e_i)$ on which $\sigma$ is constant. We denote by $dom(\sigma)$ the domain of
$\sigma$, and by $Sig(V)$ the set of signals over $V$.

For $\sigma_1, \sigma_2 \in Sig(V)$ with $dom(\sigma_i) = [0, e_i)$ ($i = 1, 2$) define their
concatenation $\sigma_1 ; \sigma_2 = \sigma$ as the signal with $dom(\sigma) = [0, e_1 + e_2)$ and such that
for $t \in [0, e_1)$, $\sigma(t) = \sigma_1(t)$ and for $t \in [e_1, e_1 + e_2)$, $\sigma(t) = \sigma_2(t - e_1)$. Hence
$(Sig(V), ;, \sigma_\varepsilon)$ becomes a noncommutative monoid whose unit is the signal $\sigma_\varepsilon$
with $dom(\sigma_\varepsilon) = [0, 0)$. Then concatenation can be extended to sets of signals
and gives rise to star: for $\Sigma \subseteq Sig(V)$ put $\Sigma^* = \bigcup_{n \in \mathbb{N}} \Sigma^n$. Here $\Sigma^0 = \{\sigma_\varepsilon\}$ and
$\Sigma^{n+1} = \Sigma^n ; \Sigma$.

For the sequel, $\mathit{Int}_{\mathbb{Q}}$ denotes the set of intervals with positive rational or
infinite bounds. Its elements are called rational intervals.

Definition 1. A real-time automaton (RTA) over the alphabet $V$ is a tuple
$\mathcal{A} = (Q, V, \lambda, \iota, \delta, Q_0, F)$ where $Q$ is the (finite) set of states, $\delta \subseteq Q \times Q$ is the
transition relation, $Q_0, F \subseteq Q$ are the sets of initial, resp. final states, $\lambda : Q \to V$
is the state labeling function and $\iota : Q \to \mathit{Int}_{\mathbb{Q}}$ is the interval labeling
function; also call $q$ an $a$-state iff $\lambda(q) = a$.

RTA work over signals: a run of length $n$ is a sequence of states $(q_i)_{i \in [n]}$
connected by $\delta$, i.e. $(q_{i-1}, q_i) \in \delta$, $\forall i \in [n]$. A run is accepting iff it starts in $Q_0$
and ends in $F$. A run is associated to a signal $\sigma$ with $dom(\sigma) = [0, e)$ iff there
exist some “splitting” points $0 = e_1 < \dots < e_{n+1} = e$ such that $e_{i+1} - e_i \in \iota(q_i)$
and $\sigma(t) = \lambda(q_i)$ for all $t \in [e_i, e_{i+1})$ and all $i \in [n]$. Note that the “splitting”
points must contain all the discontinuities but the reverse does not hold.

The language of some RTA $\mathcal{A}$ is the set of signals associated to some
accepting run of $\mathcal{A}$ and is denoted $L(\mathcal{A})$. Two RTA are equivalent iff they have
the same language. Define then the class of timed recognizable languages as
$TRec(V) = \{\Sigma \subseteq Sig(V) \mid \exists \mathcal{A} \text{ s.t. } L(\mathcal{A}) = \Sigma\}$.

Proposition 1. The emptiness problem for RTA is decidable.

The proof relies on the algorithms for computing the accessible (or coaccessible)
states, which can be done in linear time (w.r.t. $card(Q)$).

Definition 2. The set $ETRE(V)$ of elementary timed regular expressions
over $V$ is defined by the rules $E ::= 0 \mid [a]_I \mid E + E \mid E ; E \mid E^*$ where the atoms
$[a]_I$ are all expressions with $a \in V$ and $I \in \mathit{Int}_{\mathbb{Q}}$.

Semantics of elementary timed regular expressions is the following:

\[ \|[a]_I\| = \{\sigma \in Sig(V) \mid dom(\sigma) = [0, e),\ e \in I \text{ and } \forall t \in [0, e)\ \sigma(t) = a\} \]
\[ \|E + F\| = \|E\| \cup \|F\| \qquad \|E ; F\| = \|E\| ; \|F\| \qquad \|E^*\| = \|E\|^* \qquad \|0\| = \emptyset \]

Define the class of timed regular languages as $TReg(V) = \{\Sigma \subseteq Sig(V) \mid
\exists E \in ETRE(V) \text{ such that } \|E\| = \Sigma\}$.

Theorem 1 (Kleene theorem for RTA). $TRec(V) = TReg(V)$.

This theorem is proved in [Dim99]. Note that the regular expressions defined here
are weaker than the ones of [ACM97]. A class of regular expressions equivalent^1
to [ACM97] involves using sets of letters in the atoms, e.g. $[A]_{(1,2)}$ where $A \subseteq V$,
see [Dim99].

We want to show that the class of timed recognizable languages is closed 
under complementation, hence we need a subclass of RTA in which each word 
has a single run, for then complementation would be accomplished simply by 
complementation of the set of final states. 

Definition 3. An RTA $\mathcal{A}$ is language deterministic iff each signal in $L(\mathcal{A})$
is associated to a unique run. $\mathcal{A}$ is stuttering-free iff it does not have time
labels which contain 0 and transitions $(q, r)$ with $\lambda(q) = \lambda(r)$. $\mathcal{A}$ is
state-deterministic iff initial states have disjoint labels and transitions starting in
the same states have disjoint labels too, i.e. whenever $r \ne s$ and either $r, s \in Q_0$
or $(q, r), (q, s) \in \delta$ then $\lambda(r) \ne \lambda(s)$ or $\iota(r) \cap \iota(s) = \emptyset$. $\mathcal{A}$ is simply called
deterministic iff it is both state-deterministic and stuttering-free.

A first observation is that determinism implies language determinism while
state-determinism itself does not. But a more important observation is that
deterministic RTA are strictly less expressive than general RTA: Consider the
language $L_{\mathbb{N}} = \{\sigma : [0, n) \to \{a\} \mid n \in \mathbb{N}\}$ of constant signals with integer
length; it is accepted by the RTA in Figure 1(a).

Proposition 2. $L_{\mathbb{N}}$ cannot be accepted by any stuttering-free RTA.

The proof is based on the intuition that a stuttering-free RTA for $L_{\mathbb{N}}$ would need
an infinite number of states. Note that this proof works for event-clock automata
too [AFH94,HRS98], hence RTA and event-clock automata are incomparable.

However there is no problem in building an RTA for the complement of $L_{\mathbb{N}}$,
as we see from the next figure. Also in this figure we find that some stuttering
RTA can still be transformed into stuttering-free RTA.

^1 This is actually the reason for denoting the expressions introduced here as elementary.

Fig. 1. The automaton at (a) accepts $L_{\mathbb{N}}$ while the complement of $L_{\mathbb{N}}$ is accepted by
the RTA at (b). The stuttering RTA at (c) is equivalent to the stuttering-free
one at (d).

Hence we discover the need of computing the “sum” of two intervals and the
“star” of some interval, i.e. some operations that satisfy

\[ \mathbb{R}_{\ge 0} \setminus \{1\}^* = \{1\}^* + (0, 1) \quad\text{and}\quad [2, 3]^* = [2, 3] \cup [4, \infty) , \]

relations which are suggested by Figure 1.



3 Operations with Subsets of the Real Numbers 

The powerset of the positive numbers $\mathcal{P}(\mathbb{R}_{\ge 0})$ is naturally endowed with an
operation of “concatenation”: it is addition extended over sets:

\[ X + Y = \{x + y \mid x \in X,\ y \in Y\} \quad\text{for all } X, Y \subseteq \mathbb{R}_{\ge 0} , \]

whose unit is $\mathbf{0} = \{0\}$.

Moreover we can define star via the usual least fixpoint construction
$X^* = \bigcup_{n \in \mathbb{N}} nX$, where the multiples of $X$ are defined as usual: $0X = \mathbf{0}$ and
$(n+1)X = nX + X$.
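For finite sets of numbers these two operations can be computed directly. The sketch below is an illustration, not the paper's construction: it represents sets of non-negative numbers as Python sets and, since $X^*$ is generally infinite, only enumerates the elements of $X^*$ up to a caller-supplied `bound` (an assumption of the sketch).

```python
def set_sum(X, Y):
    """X + Y = {x + y : x in X, y in Y}."""
    return {x + y for x in X for y in Y}

def star_up_to(X, bound):
    """Elements of X* (the union of all multiples nX) that are <= bound."""
    result = {0}          # 0X = {0}
    frontier = {0}
    while frontier:
        # add one more copy of X, keeping only new elements within the bound
        frontier = {s for s in set_sum(frontier, X) if s <= bound} - result
        result |= frontier
    return result
```

For example, `star_up_to({2, 3}, 10)` yields every integer from 0 to 10 except 1.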

The following theorem can then be easily verified: 

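Theorem 2. The structure $\mathcal{P}(\mathbb{R}_{\ge 0}) = (\mathcal{P}(\mathbb{R}_{\ge 0}), \cup, +, \cdot^*, \emptyset, \mathbf{0})$ is a commutative
Kleene algebra, i.e. $(\mathcal{P}(\mathbb{R}_{\ge 0}), +, \mathbf{0})$ is a commutative monoid, $+$ distributes over
$\cup$ and $\cdot^*$ satisfies the following equations:

\[ X + Y \le Y \;\Rightarrow\; X^* + Y \le Y \qquad (1) \]

\[ \mathbf{0} \cup (X + X^*) \le X^* \qquad (2) \]

\[ X^* + Y^* = (X + Y)^* + (X^* \cup Y^*) \qquad (3) \]

where $X \le Y$ denotes $X \cup Y = Y$.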

Because a complement operation is available too: $\neg X = \mathbb{R}_{\ge 0} \setminus X$, we actually
get a commutative complemented Kleene algebra, i.e. a boolean algebra
which is also a commutative Kleene algebra.

Denote by $\mathcal{K}(\mathit{Int}_{\mathbb{Q}})$ the sub-(commutative complemented Kleene) algebra
generated by $\mathit{Int}_{\mathbb{Q}}$ in $\mathcal{P}(\mathbb{R}_{\ge 0})$.

Definition 4. A set $X \in \mathcal{K}(\mathit{Int}_{\mathbb{Q}})$ can be written in normal form (NF) iff
there exist finite unions of rational intervals $X_1, X_2 \in \mathit{Int}_{\mathbb{Q}}$ and some $k \in \mathbb{Q}_{\ge 0}$
such that

\[ X = X_1 \cup (X_2 + \{k\}^*) \qquad (4) \]

with the requirement that there exists some $N \in \mathbb{N}$ such that $X_1 \subseteq [0, Nk)$ and
$X_2 \subseteq [Nk, (N+1)k)$. We call this $N$ the bound of the NF.
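Thanks to the bound $N$, membership in a set given in normal form reduces to two interval lookups. The following sketch is illustrative only: the representation of an interval as a tuple `(lo, hi, lo_closed, hi_closed)` and the function names are assumptions, and it presupposes $k > 0$ together with the two inclusions required in Definition 4.

```python
from fractions import Fraction
from math import floor

def in_union(x, intervals):
    """Membership in a finite union of intervals (lo, hi, lo_closed, hi_closed)."""
    for lo, hi, lo_closed, hi_closed in intervals:
        above = lo < x or (lo_closed and x == lo)
        below = x < hi or (hi_closed and x == hi)
        if above and below:
            return True
    return False

def in_normal_form(x, X1, X2, k, N):
    """Membership of x in X1 u (X2 + {k}*), assuming k > 0,
    X1 within [0, N*k) and X2 within [N*k, (N+1)*k)."""
    x, k = Fraction(x), Fraction(k)
    if in_union(x, X1):
        return True
    # the only multiple n*k that can bring x back into [N*k, (N+1)*k)
    shifts = floor(x / k) - N
    return shifts >= 0 and in_union(x - shifts * k, X2)
```

For instance, the set $[2,3] \cup [4,\infty)$ can be described with $X_1 = [2,3]$, $X_2 = [4,5)$, $k = 1$ and bound $N = 4$, and $(0,1) + \{1\}^*$ with $X_1 = \emptyset$, $X_2 = (0,1)$, $k = 1$ and $N = 0$.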

Normal forms are not unique: for the NF in the definition and some $p \in \mathbb{N}$, the
following expression:

\[ X = (X_1 \cup (X_2 + \{0, k, 2k, \dots, (p-1)k\})) \cup (X_2 + \{pk\} + \{k\}^*) \]

is a NF too, but with bound $N + p$.

Clearly a finite union of rational intervals $X \in \mathit{Int}_{\mathbb{Q}}$ can always be put into
NF. Also note that some NF $X = X_1 \cup (X_2 + \{k\}^*)$ has $X = \emptyset$ iff both $X_1$ and
$X_2$ are empty.

Sometimes when applying different operations to NFs we might not be able
to get very easily a new NF; instead, we might get a weak normal form, which
is a decomposition like equation (4) but without the additional requirement on
the existence of the bound. However we have the following:

Lemma 1. Weak NFs can be transformed into NFs. 

The key result for NFs is the following: 

Theorem 3. Each $X \in \mathcal{K}(\mathit{Int}_{\mathbb{Q}})$ can be written in normal form.

Proof. We must show that the result of any operation applied to NF can be put
into NF. We first list some useful identities valid in $\mathcal{P}(\mathbb{R}_{\ge 0})$ ES n ni:

\[ X^{**} = X^* \qquad (5) \]

\[ (X \cup Y)^* = X^* + Y^* \qquad (6) \]

\[ (X^* + Y)^* = \{0\} \cup (X^* + Y^* + Y) \qquad (7) \]

Also note the following ultimately periodicity property:

Given $n$ distinct positive rationals $a_1, \dots, a_n \in \mathbb{Q}_{\ge 0}$ we have that $\{a_1, \dots, a_n\}^*$
is ultimately periodic, i.e. there exist some finite set of rationals $B$ and
some rationals $q, r \in \mathbb{Q}_{\ge 0}$ such that

\[ \{a_1, \dots, a_n\}^* = B \cup (\{q\} + \{r\}^*) \qquad (8) \]

This result can be seen as a corollary of the normal form theorem of the regular
languages over a one letter alphabet.
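For positive integers the decomposition (8) can be found by a small dynamic program. The sketch below is an illustration restricted to integer generators (rational generators can be scaled to integers first); the function name and the search limit are assumptions of the sketch, the limit being chosen generously on the assumption that it lies past the last gap of the generated set.

```python
from functools import reduce
from math import gcd

def periodic_decomposition(gens):
    """Return (B, q, r) with {gens}* = B u ({q} + {r}*), for positive integers gens."""
    r = reduce(gcd, gens)
    # assumed to be large enough that no gaps occur beyond it
    limit = max(gens) ** 2 // r + max(gens)
    reachable = [False] * (limit + 1)
    reachable[0] = True
    for s in range(1, limit + 1):
        reachable[s] = any(s >= g and reachable[s - g] for g in gens)
    # multiples of r below the limit that are not sums of generators
    gaps = [s for s in range(0, limit + 1, r) if not reachable[s]]
    q = gaps[-1] + r if gaps else 0
    B = [s for s in range(q) if reachable[s]]
    return B, q, r
```

For the generators $\{3, 5\}$ this gives $B = \{0, 3, 5, 6\}$, $q = 8$ and $r = 1$: every integer from 8 onwards is a sum of 3s and 5s, while 7 is not.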

Fix now two NFs $X = X_1 \cup (X_2 + \{k\}^*)$ with bound $M$ and $Y = Y_1 \cup (Y_2 + \{l\}^*)$
with bound $N$ and denote $m = lcm(k, l)$. We then get the following form for
$X \cup Y$:

\[ X_1 \cup Y_1 \cup \left( \left( \bigcup_{i=0}^{m/k - 1} (X_2 + ik) \;\cup\; \bigcup_{i=0}^{m/l - 1} (Y_2 + il) \right) + \{m\}^* \right) \]

This is a WNF and Lemma 1 helps transforming it into NF.

For $X + Y$ distributivity of $+$ over $\cup$ transforms it into:

\[ (X_1 + Y_1) \cup (X_1 + Y_2 + \{l\}^*) \cup (X_2 + Y_1 + \{k\}^*) \cup (X_2 + Y_2 + \{k\}^* + \{l\}^*) \]

An instantiation of identity (6) gives $\{k\}^* + \{l\}^* = \{k, l\}^*$. The ultimately
periodicity property (8) gives a NF for the last and thence we have above a union of
(weak) NFs which we already know how to bring to NF.

For $X^*$ we have two cases. The first one occurs when one of $X_1$ and $X_2$
contains a nonpoint interval. Then the set $X^*$ is a finite union of rational
intervals, so it is in NF. To prove this claim, note that for each nonpoint interval,
say $(a, b]$ (that is $b - a > 0$), denoting $m_0 = \lceil a / (b - a) \rceil$, we have that

\[ (a, b]^* = \mathbf{0} \cup \bigcup_{i=1}^{m_0 - 1} (ia, ib] \cup (m_0 a, \infty) \]

since the choice of $m_0$ assures that $(m_0 + 1)a \le m_0 b$.
Hence from the $m_0$-th iteration the intervals start to overlap. This observation
can be easily adapted to prove our claim.
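The decomposition of the star of a nonpoint interval can be computed exactly with rational arithmetic. The sketch below is illustrative: it assumes that $m_0$ is the least integer with $(m_0 + 1)a \le m_0 b$, i.e. $m_0 = \lceil a/(b-a) \rceil$, and the function name and return format are not from the paper.

```python
from fractions import Fraction
from math import ceil

def star_of_interval(a, b):
    """Pieces of (a, b]* for 0 <= a < b: returns (m0, bounded, tail) where
    bounded lists the intervals (i*a, i*b] for 1 <= i < m0 and the remaining
    part is {0} together with the unbounded interval (tail, oo)."""
    a, b = Fraction(a), Fraction(b)
    m0 = ceil(a / (b - a))  # least m with (m + 1) * a <= m * b
    bounded = [(i * a, i * b) for i in range(1, m0)]
    return m0, bounded, m0 * a
```

For $a = 2$, $b = 3$ the result is $m_0 = 2$, the single bounded piece $(2, 3]$ and the tail $(4, \infty)$; for $a = 5$, $b = 6$ the multiples only start to overlap at the fifth iteration.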

The second case for $X^*$ is when both $X_1$ and $X_2$ consist of point intervals.
Applying identity (6) we get that $X^* = X_1^* + (X_2 + \{k\}^*)^*$. Then by the ultimately
periodicity property (8) $X_1^*$ can be written into NF, so we concentrate on
$(X_2 + \{k\}^*)^*$. Therefore, by identities (7) and (6) we get

\[ (X_2 + \{k\}^*)^* = \mathbf{0} \cup (X_2 + X_2^* + \{k\}^*) = \mathbf{0} \cup (X_2 + (X_2 \cup \{k\})^*) \]

Now the ultimately periodicity property (8) tells us that $(X_2 \cup \{k\})^*$ can be put
into NF, so we can also find a NF for $X^*$ by the previous cases.

For $\neg X$ note first that $\neg[a, b) = [0, a) \cup [b, \infty)$ (when $b < \infty$), and $\neg[a, \infty) = [0, a)$.
Also, by De Morgan laws, $\neg(I_1 \cup \dots \cup I_n) = \neg I_1 \cap \dots \cap \neg I_n$ is a NF
and, for any NF, $\neg X = \neg X_1 \cap \neg(X_2 + \{k\}^*)$. Then, by distributivity of $\cap$, we
may restrict ourselves to the case $X_1 = [a, b)$ with $a, b \in \mathbb{R}_{\ge 0}$ and $k \ne 0$ (the
cases with other parentheses are similar). But since $X_2 \cup [0, Nk) \cup [Nk + k, \infty)$
is a finite union of rational intervals (hence we know how to write NFs for the
complement of it) the following is a NF for $\neg X$:

\[ [0, a) \cup [b, Nk) \cup (\neg(X_2 \cup [0, Nk) \cup [Nk + k, \infty)) + \{k\}^*) \qquad \Box \]

Note that in firm a weaker version of this theorem is proved, roughly saying
that the set of finite unions of n-dimensional NFs forms a boolean algebra.

Though the theorem is based on the same technique that gives the normal
form of regular languages over a one-letter alphabet it cannot be a simple
corollary of that. Even if we restrict attention to the algebra generated by intervals
with natural bounds, denote it $\mathbb{N}\mathit{Int}$, this has two generators: the point set $\{1\}$
and the nonpoint interval $(0, 1)$. Neither of these generators alone may generate
the whole $\mathbb{N}\mathit{Int}$: $\{1\}$ generates just sets with isolated points or complements
of such sets (i.e. countable or co-countable sets) and hence does not generate
$(0, 1)$, while $(0, 1)$ generates just finite unions of intervals (it cannot generate
$(0, 1) + \{1\}^*$).

One might also think that the result follows from Eilenberg’s theory of
automata with multiplicities [Eil74]. But this is not the case either since in that
work star is defined via some formal power series and one cannot prove, unless
defining some suitable equivalence on power series, that e.g. $[0, 1)^* = [0, \infty)$.

Finally note the interesting relation which holds between the two generators
of $\mathbb{N}\mathit{Int}$:

\[ (0, 1)^* = (\mathbf{0} \cup (0, 1)) + \{1\}^* \]

At the end of this section we make a brief excursion into matrix theory.
We construct the Kleene algebra of $n \times n$ matrices over $\mathcal{P}(\mathbb{R}_{\ge 0})$, whose
operations are defined as follows:

\[ (A \cup B)_{ij} = A_{ij} \cup B_{ij} \]

\[ (A + B)_{ij} = \bigcup_{k=1}^{n} (A_{ik} + B_{kj}) \]

\[ A^* = \bigcup_{n \in \mathbb{N}} nA \]

The star of matrix $A$ can be computed by the usual Floyd-Warshall-Kleene
algorithm [Kle56,Eil74]: we recursively define a sequence of $n + 1$ matrices $A(k)$
($0 \le k \le n$) with $A(0) = A$ and

\[ A(k)_{ij} = A(k-1)_{ij} \cup \left( A(k-1)_{ik} + (A(k-1)_{kk})^* + A(k-1)_{kj} \right) \qquad (9) \]

Proposition 3 ([Eil74]). $A(n) = A^*$ for any matrix over $\mathcal{P}(\mathbb{R}_{\ge 0})$.



Corollary 1. If A is a matrix of normal forms then A* is a matrix of normal 
forms too which can be computed by the above algorithm. 
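Recurrence (9) can be run on a simplified model. The sketch below is not an implementation over normal forms: as a stand-in it uses finite sets of natural numbers truncated at a caller-supplied `bound` (an assumption of the sketch, which is harmless because all numbers are non-negative, so sums below the bound only depend on summands below the bound). The function names are illustrative, and the code computes exactly the matrix $A(n)$ defined by (9).

```python
def set_sum(X, Y, bound):
    """Truncated sum {x + y : x in X, y in Y, x + y <= bound}."""
    return frozenset(x + y for x in X for y in Y if x + y <= bound)

def set_star(X, bound):
    """Elements of X* that are <= bound."""
    result, frontier = {0}, {0}
    while frontier:
        frontier = set(set_sum(frontier, X, bound)) - result
        result |= frontier
    return frozenset(result)

def kleene_closure(A, bound):
    """Iterate recurrence (9): A(k) is built from A(k-1) for k = 1..n."""
    n = len(A)
    cur = [[frozenset(x) for x in row] for row in A]
    for k in range(n):
        loop = set_star(cur[k][k], bound)
        # the comprehension reads the previous matrix before cur is rebound
        cur = [[cur[i][j] | set_sum(set_sum(cur[i][k], loop, bound), cur[k][j], bound)
                for j in range(n)] for i in range(n)]
    return cur
```

On a two-state example where the first state can be left after 2 time units and the second after 3, the entry for returning to the first state collects the multiples of 5 up to the bound.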

4 Determinization and Complementation of RTA 

Definition 5. An augmented real-time automaton (ARTA for short) over
$V$ is a tuple $\mathcal{A} = (Q, V, \delta, \lambda, \iota, Q_0, F)$ where $Q$, $Q_0$, $F$, $\delta$ and $\lambda$
are the same as for RTA, while $\iota$ assigns a time label to each state of $Q$
(actually $\iota$ gives a NF).

ARTA work similarly to RTA: runs have the same definition and a signal
$\sigma$ with $\mathrm{dom}(\sigma) = [0, e)$ is associated to a run of length $n$ iff there exist
$0 = t_1 < \ldots < t_{n+1} = e$ with $t_{i+1} - t_i \in \iota(q_i)$ and $\sigma(t) = \lambda(q_i)$ for all
$t \in [t_i, t_{i+1})$ and all $i \in [n]$. The emptiness problem is again decidable in
linear time w.r.t. $\mathrm{card}(Q)$. Note that we need a preliminary step in which
states $q$ with $\iota(q) = \emptyset$ are removed. The different notions of determinism
also apply to ARTA with the same definitions; hence we will speak of
state-deterministic ARTA and stuttering-free ARTA.

The following theorem says that the use of NFs instead of just intervals does
not add to the expressive power of RTA:

Theorem 4. TReg(V) equals the class of languages accepted by ARTA.
The proof is very close to that of an earlier theorem and is based on the
following property of regular expressions:

I [o](a,b)U((c,d) + {fe}*) I = l[®](a,6) + [o](c,d)i [®]{fc}l 

which can be easily extended to any atomic ETRE. 

Hence we may focus on the problem of complementing ARTA, since ARTA
and RTA are equally expressive by Theorem 4.

Theorem 5. Each ARTA is equivalent to some deterministic ARTA. 

Proof. As a preliminary step, in the given ARTA we modify all states $q$ in
which $0 \in \iota(q)$ by removing 0 from $\iota(q)$ and adding some new transitions, the
whole process closely resembling the removal of $\varepsilon$-transitions from finite
automata. We also assume that all states with empty time label have been
removed.

We first achieve stuttering freeness by removing all stuttering transitions
between a-states, for some $a \in V$, and repeating this for all the other letters in
$V$. The idea is to find, for each pair of states $(q, r)$, the set of positive numbers
which are the duration of a signal that is associated to some run starting in $q$,
ending in $r$ and containing only a-states. For this we need to recursively add all
the intervals of the states that may lie on such a run. This is one of the places
where we apply the normal form theorem and the algorithm for computing the
star of a matrix of sets of positive numbers. The formalization is the following:

Start with some ARTA $\mathcal{A} = (Q, V, \delta, \lambda, \iota, Q_0, F)$ and number its transitions
as $\delta = \{t_1, \ldots, t_p\}$ (denote $t_i = (in_i, out_i)$). Construct a matrix $A$ whose elements
are the time labels of the a-labeled states: $A_{ij} = X$ iff $out_i = in_j$, $\lambda(out_i) = a$ and
$\iota(out_i) = X$, otherwise $A_{ij} = \emptyset$. Then we add two more rows and columns to
$A$ (the $(p+1)$-th and the $(p+2)$-th) which intuitively record the time labels of
initial, resp. final a-states: for $j \in [p]$ put $A_{p+1,j} = X$ for all $j$ with $in_j \in Q_0$,
$\lambda(in_j) = a$ and $\iota(in_j) = X$, otherwise $A_{p+1,j} = \emptyset$, and $A_{j,p+2} = X$ for all $j$ with
$out_j \in F$, $\lambda(out_j) = a$ and $\iota(out_j) = X$; moreover put
$A_{p+1,p+2} = \bigcup \{X \mid \exists q \in Q_0 \cap F,\ \lambda(q) = a \text{ and } \iota(q) = X\}$.

Then $A^*$ holds the lengths of all signals associated to runs consisting of
a-states only. Hence $(A^*)_{ij}$ consists of the lengths of signals associated to runs
starting in $out_i$, ending in $in_j$ and consisting of a-states only. Also $(A^*)_{p+1,j}$
consists of the lengths of all runs that start in $Q_0$, end in $in_j$ and consist of
a-states only. Similarly for $(A^*)_{i,p+2}$ and $(A^*)_{p+1,p+2}$. Computation of $A^*$ is
done by the Floyd-Warshall-Kleene algorithm (9). Note here the importance of
Corollary 1: the elements of $A^*$ are still NFs, hence they may be used for labeling
some new states of an ARTA.

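Hence, while non-a-states will be preserved, the nonempty components of
$A^*$ will replace all a-states: their time label will be $(A^*)_{ij}$ and they will be
connected only to non-a-states. Formally, denote $Q_{\neq a} = \{q \in Q \mid \lambda(q) \neq a\}$ and
$Q_a = \{(i,j) \mid (A^*)_{ij} \neq \emptyset\}$ and define $\mathcal{B} = (Q_{\neq a} \cup Q_a, V, \bar\delta, \bar\lambda, \bar\iota, T_0, T_f)$ where

$\bar\lambda(s) = \lambda(s)$ for $s \in Q_{\neq a}$ and $\bar\lambda((i,j)) = a$ for $(i,j) \in Q_a$;

$\bar\iota(s) = \iota(s)$ for $s \in Q_{\neq a}$ and $\bar\iota((i,j)) = (A^*)_{ij}$ for $(i,j) \in Q_a$;

$T_0 = (Q_{\neq a} \cap Q_0) \cup \{(p+1, j) \in Q_a\} \cup \{(p+1, p+2) \mid (A^*)_{p+1,p+2} \neq \emptyset\}$;

$T_f = (Q_{\neq a} \cap F) \cup \{(i, p+2) \in Q_a\} \cup \{(p+1, p+2) \mid (A^*)_{p+1,p+2} \neq \emptyset\}$;

$\bar\delta = \{(in_i, (i,j)) \mid \lambda(in_i) \neq a\} \cup \{((i,j), out_j) \mid \lambda(out_j) \neq a\} \cup (\delta \cap (Q_{\neq a} \times Q_{\neq a}))$.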

The proof that L{B) = L{A) relies on TheoremE] Note that by construction 
no two a-states are directly connected and by the exclusion of zeroes from inter- 
vals in the beginning all components of A* will not contain 0. Also, all transitions 
between non a-states are preserved, hence no stuttering transitions are added. 
This shows that after applying it for all letters in V we get a stuttering-free 
ARTA. 

The determinization construction is an adaptation of the subset construction.
Start with some ARTA $\mathcal{B} = (Q, V, \delta, \lambda, \iota, Q_0, F)$ assumed stuttering-free.
If the time labels did not count then the states of the deterministic automaton
would be sets of identically state-labeled states and we would draw a transition
from some $S_1$ with $\lambda(S_1) = \{a\}$ to some $S_2$ with $\lambda(S_2) = \{b\}$ iff
$S_2 = \{r \in Q \mid \exists q \in S_1 \text{ s.t. } (q, r) \in \delta\}$. Taking into account the time labels is
done by splitting $S_2$ into several "smaller" sets such that the time labels of these
parts give a partition of $\mathbb{R}_{>0}$.

Therefore we start with $T$, the set of triples $(S, S', a)$ where $a \in V$, $S' \subseteq S \subseteq Q$
with $\lambda(S) = \{a\}$, and $\bar\lambda((S, S', a)) = a$. Intuitively the control passes through
$(S, S', a)$ iff in $\mathcal{B}$ the control may pass through some state in $S'$ but not through
any of the states in $S \setminus S'$. Formally we associate to each $U \subseteq Q$ with $\lambda(U) = \{a\}$
the set $Tl(U) = \{\iota(q) \mid q \in U\}$ of time labels of its states and define

$$\bar\iota((S, S', a)) = \mathbb{R}_{>0} \cap \Bigl(\bigcap Tl(S')\Bigr) \cap \overline{\Bigl(\bigcup Tl(S \setminus S')\Bigr)}$$

where the usual conventions $\bigcap \emptyset = \mathbb{R}_{\geq 0}$ and $\bigcup \emptyset = \emptyset$ apply. Note that we put
$\mathbb{R}_{>0}$ in front of $\bar\iota((S, S', a))$ because otherwise we lose stuttering-freeness.

It is also important to note that it is here where we need the result that NFs
are closed under complementation, because we need to put $\bar\iota((S, S', a))$ in NF
and $\bar\iota((S, S', a))$ contains complementation.

However, $T$ is not yet what we need: it might still happen that
$\bar\iota(S, S', a) \cap \bar\iota(S, S'', a) \neq \emptyset$, but only when $\bar\iota(S, S', a) = \bar\iota(S, S'', a)$. Hence we can define
an equivalence on $T$ as $(S, S', a) \sim (S, S'', a)$ iff $\bar\iota(S, S', a) = \bar\iota(S, S'', a)$. The
quotient set $T/{\sim}$ is our desired set of states, as for any $S \subseteq Q$ the
time labels of the classes $[S, S', a]$ for all $S' \subseteq S$ give a partition of $\mathbb{R}_{>0}$.

Hence we build $\mathcal{C} = (T/{\sim}, \bar\delta, \bar\lambda, \bar\iota, T_0, T_f)$, where $\bar\lambda([S, S', a]) = a$ and
$\bar\iota([S, S', a]) = \bar\iota((S, S', a))$. $\bar\delta$ consists of transitions from $[S, S', a]$ to each $[U, U', b]$
where $U = \{q \in Q \mid \exists r \in S' \text{ s.t. } (r, q) \in \delta \text{ and } \lambda(q) = b\}$ and $U' \subseteq U$. Case $U' = \emptyset$
stands for the situation when the length of the current state in the signal is not
in any of the sets from $Tl(U)$. (Note how states $[\emptyset, \emptyset, a]$ time-labeled with $\mathbb{R}_{>0}$
play the role of the trap states in DFA.) Moreover put

$T_0 = \{[S, S', a] \in T/{\sim} \mid \emptyset \neq S \subseteq Q_0,\ a \in V\} \cup \{[\emptyset, \emptyset, a] \mid \forall q \in Q_0,\ \lambda(q) \neq a\}$

$T_f = \{[S, S', a] \in T/{\sim} \mid S' \cap F \neq \emptyset,\ a \in V\}$

The proof that C is equivalent to B follows the classical pattern. □ 

Theorem 6. TRec{V) is closed under complementation. 

Proof. This is a corollary of the above theorem: in the ARTA C constructed 
above, each signal is associated to a unique run that starts in Tq. Hence, the 
ARTA that accepts Sig{S) \ L{B) is the automaton obtained from C by comple- 
menting its set of final states. □ 

We actually have a normal form for RTAs resulting from this, namely that 
each RTA is equivalent to a RTA in which any circuit composed of stuttering 
transitions is a loop at a state labeled with a point interval, stuttering steps 
may only start from loop states and any chain of different stuttering steps has 
at most two transitions. An example for this is in figure 1(6). 

5 Conclusions and Further Work 

We have presented here a Kleene algebra of sets of positive numbers where 
elements have a finite representation and a class of real-time languages which is 
closed under complementation and is defined by some automata and by regular 
expressions. 

We briefly note two problems for further study: the first one concerns the 
possible application of this theory to removing of circuits of silent transitions in 
timed automata [IHI)(f PflR] . possibly using the Kleene theorem of [EEnS|. The 
second is whether a normal form can be found for unions of intervals which 
have any bounds (not necessarily rational) and are generated from finite sets of 
intervals by $\cup$, complementation, concatenation and star.

Abstract. In this paper we develop a new algorithm for deciding the
winner in parity games, and hence also for the modal $\mu$-calculus model
checking. The design and analysis of the algorithm is based on a notion of
game progress measures: they are witnesses for winning strategies in
parity games. We characterize game progress measures as pre-fixed points of
certain monotone operators on a complete lattice. As a result we get the
existence of the least game progress measures and a straightforward way
to compute them. The worst-case running time of our algorithm matches
the best worst-case running time bounds known so far for the problem,
achieved by the algorithms due to Browne et al., and Seidl. Our
algorithm has better space complexity: it works in small polynomial space;
the other two algorithms have exponential worst-case space complexity.



1 Introduction 

A parity game is an infinite path-forming game played by two players, player ◇
and player □, on a graph with integer priorities assigned to vertices. In order
to determine the winner in an infinite play we check the parity of the lowest
priority occurring infinitely often in the play: if it is even then player ◇ wins,
otherwise player □ is the winner. The problem of deciding the winner in parity
games is, given a parity game and an initial vertex, to decide whether player ◇
has a winning strategy from the vertex.

There are at least two motivations for the study of the complexity of
deciding the winner in parity games. One is that the problem is polynomial time
equivalent to the modal $\mu$-calculus model checking [2], hence developing better
algorithms for parity games may lead to better model checking tools, which is
a major objective in computer aided verification. The other is that the problem
has an interesting status from the point of view of structural complexity theory.
It is known to be in NP $\cap$ co-NP (and even in UP $\cap$ co-UP), and hence
it is very unlikely to be NP-complete, but at the same time it is not known
to be in P, despite substantial effort of the community.

Progress measures are decorations of graphs whose local consistency
guarantees some global, often infinitary, properties of graphs. Progress measures have
been used successfully for complementation of automata on infinite words and
trees; they also underlie a translation of alternating parity automata on
infinite words to weak alternating automata. A similar notion, called a
signature, occurs in the study of the modal $\mu$-calculus. Signatures have been
used to prove memoryless determinacy of parity games.

Our algorithm for parity games is based on the notion of game parity progress 
measures; Walukiewicz calls them consistent signature assignments. Game
parity progress measures are witnesses for winning strategies in parity games. 
We provide an upper bound on co-domains of progress measures; this reduces the 
search space of potential witnesses. Then we provide a characterization of game 
parity progress measures as pre-fixed points of certain monotone operators on a 
finite complete lattice. This characterization implies that the least game parity 
progress measures exist, and it also suggests an easy way to compute them. 

The modal $\mu$-calculus model checking problem is, given a formula $\varphi$ of the
modal $\mu$-calculus and a Kripke structure $K$ with a set of states $S$, to decide
whether the formula is satisfied in the initial state of the Kripke structure. The
problem has been studied by many researchers; see for example OSD El ^ 
and references therein. The algorithms with the best proven worst-case running 
time bounds so far are due to Browne et al. Seidl m- Their worst-case 

running times are roughly 0{m ■ and 0{m ■ respectively, 

where $n$ and $m$ are some numbers depending on $\varphi$ and $K$, such that $n \leq |S| \cdot |\varphi|$,
$m \leq |K| \cdot |\varphi|$, and $d$ is the alternation depth of the formula $\varphi$.

In fact, number $n$ above is the number of vertices in the parity game
obtained from the formula and the Kripke structure via the standard reduction of
the modal $\mu$-calculus model checking to parity games, and $m$ is the number of
edges in the game graph; see for example Moreover, the reduction can 

be done in such a way that the number of different priorities in the parity game 
is equal to the alternation depth d of the formula. Our algorithm has worst- 
case running time 0(m ■ and it can be made to work in time 

0(to • , hence it matches the bounds of the other two algorithms. 

Moreover, it works in space $O(dn)$ while the other two algorithms have
exponential worst-case space complexity. Our algorithm can be seen as a generic
algorithm allowing many different evaluation policies; good heuristics can
potentially improve performance of the algorithm. However, we show a family of
examples for which worst-case running time occurs for all evaluation policies.

Among algorithms for parity games it is worthwhile to mention the algorithm
of McNaughton [12] and its modification due to Zielonka. In the extended
version of this paper we show that Zielonka's algorithm can be implemented
to work in time roughly $O(m \cdot (n/d)^d)$ and we also provide a family of
examples for which the algorithm needs this time. Zielonka's algorithm works in fact
for games with more general Muller winning conditions. By a careful analysis
of the algorithm for games with Rabin (Streett) winning conditions we get a
running time bound $O(m \cdot n^{2k}/(k/2)^k)$, where $k$ is the number of pairs in the
Rabin (Streett) condition. The algorithm also works in small polynomial space.
This compares favourably with other algorithms for the linear-time equivalent
problem of checking non-emptiness of non-deterministic Rabin (Streett) tree
automata, and makes it the best algorithm known for this NP-complete
(co-NP-complete) problem.

2 Parity Games 

Notation: For all $n \in \mathbb{N}$, by $[n]$ we denote the set $\{0, 1, 2, \ldots, n-1\}$. If $(V, E)$
is a directed graph and $W \subseteq V$, then by $(V, E) \upharpoonright W$ we denote the subgraph
$(W, F)$ of $(V, E)$, where $F = E \cap W^2$. [Notation] □

A parity graph $G = (V, E, p)$ consists of a directed graph $(V, E)$ and a priority
function $p : V \to [d]$, where $d \in \mathbb{N}$. A parity game $\Gamma = (V, E, p, (V_\Diamond, V_\Box))$ consists
of a parity graph $G = (V, E, p)$, called the game graph of $\Gamma$, and of a partition
$(V_\Diamond, V_\Box)$ of the set of vertices $V$. For technical convenience we assume that all
game graphs have the property that every vertex has at least one out-going
edge. We also restrict ourselves throughout this paper to games with finite game
graphs.

A parity game is played by two players: player ◇ and player □, who form
an infinite path in the game graph by moving a token along edges. They start
by placing the token on an initial vertex and then they take moves indefinitely
in the following way. If the token is on a vertex in $V_\Diamond$ then player ◇ moves the
token along one of the edges going out of the vertex. If the token is on a vertex
in $V_\Box$ then player □ takes a move. As a result the players form an infinite path
$\pi = (v_1, v_2, v_3, \ldots)$ in the game graph; for brevity we refer to such infinite paths
as plays. The winner in a play is determined by referring to priorities of vertices
in the play. Let $\mathrm{Inf}(\pi)$ denote the set of priorities occurring infinitely often in
$(p(v_1), p(v_2), p(v_3), \ldots)$. A play $\pi$ is a winning play for player ◇ if $\min(\mathrm{Inf}(\pi))$
is even, otherwise $\pi$ is a winning play for player □.
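
As a small illustration of the winning condition, the sketch below decides the winner of an ultimately periodic play, given as a finite prefix followed by a cycle that is repeated forever. The function name `play_winner`, the dictionary encoding of the priority function and the strings naming the two players are assumptions made for this example only.

```python
def play_winner(priority, prefix, cycle):
    """Return "diamond" if the play prefix . cycle^omega is winning for
    player diamond, and "box" otherwise.

    priority: dict mapping each vertex to its integer priority
    prefix:   finite list of vertices visited once (may be empty)
    cycle:    non-empty list of vertices repeated forever
    """
    if not cycle:
        raise ValueError("the repeated part of the play must be non-empty")
    # Only the vertices on the cycle occur infinitely often, so Inf(pi) is
    # exactly the set of priorities seen on the cycle; the prefix is irrelevant.
    inf = {priority[v] for v in cycle}
    return "diamond" if min(inf) % 2 == 0 else "box"


priority = {"a": 3, "b": 2, "c": 1}
print(play_winner(priority, ["c"], ["a", "b"]))  # lowest recurring priority is 2
print(play_winner(priority, [], ["a", "c"]))     # lowest recurring priority is 1
```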

A function $\sigma : V_\Diamond \to V$ is a strategy for player ◇ if $(v, \sigma(v)) \in E$ for all
$v \in V_\Diamond$. A play $\pi = (v_1, v_2, v_3, \ldots)$ is consistent with a strategy $\sigma$ for player ◇ if
$v_{\ell+1} = \sigma(v_\ell)$, for all $\ell \in \mathbb{N}$ such that $v_\ell \in V_\Diamond$. A strategy $\sigma$ is a winning strategy
for player ◇ from a set $W \subseteq V$, if every play starting from a vertex in $W$ and
consistent with $\sigma$ is winning for player ◇. Strategies and winning strategies are
defined similarly for player □.

Theorem 1 (Memoryless Determinacy 113] ]) 

For every parity game, there is a unique partition $(W_\Diamond, W_\Box)$ of the set of vertices
of its game graph, such that there is a winning strategy for player ◇ from $W_\Diamond$,
and a winning strategy for player □ from $W_\Box$.

We call the sets $W_\Diamond$ and $W_\Box$ the winning sets of player ◇ and player □,
respectively. The problem of deciding the winner in parity games is, given a parity
game and a vertex in the game graph, to determine whether the vertex is in the
winning set of player ◇.

Before we proceed we mention a simple characterization of winning strategies
for player ◇ in terms of simple cycles in a subgraph of the game graph associated
with the strategy. We say that a strategy $\sigma$ for player ◇ is closed on a set $W \subseteq V$
if for all $v \in W$, we have:

- if $v \in V_\Diamond$ then $\sigma(v) \in W$, and

- if $v \in V_\Box$ then $(v, w) \in E$ implies $w \in W$.

Note that if a strategy $\sigma$ for player ◇ is closed on $W$ then every play starting
from a vertex in $W$ and consistent with $\sigma$ stays within $W$.

If $\sigma$ is a strategy for player ◇ then by $G_\sigma$ we denote the parity graph $(V, E_\sigma, p)$
obtained from game graph $G = (V, E, p)$ by removing from $E$ all edges $(v, w)$
such that $v \in V_\Diamond$ and $\sigma(v) \neq w$.

We say that a cycle in a parity graph is an i-cycle if i is the smallest priority 
of a vertex occurring in the cycle. A cycle is an even cycle if it is an i-cycle for 
some even i, otherwise it is an odd cycle. The following proposition is not hard 
to prove. 

Proposition 2 Let $\sigma$ be a strategy for player ◇ closed on $W$. Then $\sigma$ is a
winning strategy for player ◇ from $W$ if and only if all simple cycles in $G_\sigma \upharpoonright W$
are even.
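
The criterion of Proposition 2 can be checked mechanically. The following is a minimal sketch, assuming the strategy is already known to be closed on the set passed in; the function names, the `owner` dictionary with values `"diamond"` and `"box"`, and the edge-set encoding are illustrative choices, not notation from the paper. It uses the observation that an odd cycle exists exactly when some vertex of odd priority i lies on a cycle that visits only vertices of priority at least i.

```python
def strategy_subgraph(edges, owner, sigma):
    """Edges of G_sigma: drop each edge (v, w) where v belongs to player
    diamond and the strategy sigma picks a successor other than w."""
    return {(v, w) for (v, w) in edges
            if owner[v] != "diamond" or sigma[v] == w}


def all_cycles_even(vertices, edges, priority):
    """True iff every cycle of the graph induced on `vertices` is even."""
    succ = {v: [] for v in vertices}
    for v, w in edges:
        if v in succ and w in succ:  # restrict the graph to `vertices`
            succ[v].append(w)
    for u in vertices:
        i = priority[u]
        if i % 2 == 0:
            continue
        # Search for a path from u back to u that only uses vertices of
        # priority >= i; such a path closes an i-cycle with i odd.
        stack = [w for w in succ[u] if priority[w] >= i]
        seen = set(stack)
        while stack:
            x = stack.pop()
            if x == u:
                return False
            for y in succ[x]:
                if priority[y] >= i and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return True


edges = {("a", "b"), ("b", "a"), ("a", "c"), ("c", "a"), ("c", "c")}
priority = {"a": 0, "b": 1, "c": 3}
owner = {"a": "diamond", "b": "box", "c": "diamond"}
good = strategy_subgraph(edges, owner, {"a": "b", "c": "a"})
bad = strategy_subgraph(edges, owner, {"a": "b", "c": "c"})
print(all_cycles_even({"a", "b", "c"}, good, priority))
print(all_cycles_even({"a", "b", "c"}, bad, priority))
```

The second strategy keeps the self-loop at the vertex of priority 3, which is an odd cycle, so the check fails for it on the full vertex set but still succeeds when restricted to the vertices `a` and `b`.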

3 Small Progress Measures 

In this section we study a notion of progress measures. Progress measures play 
a key role in the design and analysis of our algorithm for solving parity games. 

First we define parity progress measures for parity graphs, and we show that 
there is a parity progress measure for a parity graph if and only if all cycles in 
the graph are even. In other words, parity progress measures are witnesses for 
the property of parity graphs having only even cycles. The proof of the ‘if’ part 
also provides an upper bound on the size of the co-domain of a parity progress 
measure. Then we define game parity progress measures for parity games, we
argue that they are witnesses for winning strategies for player ◇, and we show that
the above-mentioned upper bound holds also for game parity progress measures.

Notation: If $\alpha \in \mathbb{N}^d$ is a $d$-tuple of non-negative integers then we number
its components from 0 to $d-1$, i.e., we have $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_{d-1})$. When
applied to tuples of natural numbers, the comparison symbols $<$, $\leq$, $=$, $\neq$, $\geq$, and
$>$ denote the lexicographic ordering. When subscripted with a number $i \in \mathbb{N}$
(e.g., $<_i$, $=_i$, $\geq_i$), they denote the lexicographic ordering on $\mathbb{N}^i$ applied to the
arguments truncated to their first $i$ components. For example, $(2, 3, 0, 0) >_2
(2, 2, 4, 1)$, but $(2, 3, 0, 0) =_0 (2, 2, 4, 1)$. [Notation] □
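
The truncated comparisons can be sketched in a few lines. The code below assumes that the subscript i keeps components 0 through i of each tuple, which is the reading under which the strictness requirement for an odd priority i involves component i; both examples from the notation paragraph hold under this reading. The helper names `truncate` and `cmp_trunc` are hypothetical.

```python
def truncate(alpha, i):
    """Keep components 0..i of the tuple alpha (assumed meaning of subscript i)."""
    return tuple(alpha[: i + 1])


def cmp_trunc(alpha, beta, i):
    """Return -1, 0 or 1 when alpha <_i beta, alpha =_i beta or alpha >_i beta."""
    a, b = truncate(alpha, i), truncate(beta, i)
    # Python compares tuples lexicographically, which is the order we need.
    return (a > b) - (a < b)


print(cmp_trunc((2, 3, 0, 0), (2, 2, 4, 1), 2))  # strictly greater
print(cmp_trunc((2, 3, 0, 0), (2, 2, 4, 1), 0))  # equal
```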

Definition 3 (Parity progress measure)

Let $G = (V, E, p : V \to [d])$ be a parity graph. A function $\varrho : V \to \mathbb{N}^d$ is a parity
progress measure for $G$ if for all $(v, w) \in E$, we have $\varrho(v) \geq_{p(v)} \varrho(w)$, and the
inequality is strict if $p(v)$ is odd. [Definition 3] □

Proposition 4 If there is a parity progress measure for a parity graph $G$ then
all cycles in $G$ are even.

Proof: Let $\varrho : V \to \mathbb{N}^d$ be a parity progress measure for $G$. For the sake
of contradiction suppose that there is an odd cycle $v_1, v_2, \ldots, v_\ell$ in $G$, and let
$i = p(v_1)$ be the smallest priority on this cycle. Then by the definition of a
progress measure we have $\varrho(v_1) >_i \varrho(v_2) \geq_i \varrho(v_3) \geq_i \cdots \geq_i \varrho(v_\ell) \geq_i \varrho(v_1)$, and
hence $\varrho(v_1) >_i \varrho(v_1)$, a contradiction. [Proposition 4] ■

If $G = (V, E, p : V \to [d])$ is a parity graph then for every $i \in [d]$, we write $V_i$ to
denote the set $p^{-1}(i)$ of vertices with priority $i$ in parity graph $G$. Let $n_i = |V_i|$,
for all $i \in [d]$. Define $M_G$ to be the following finite subset of $\mathbb{N}^d$: if $d$ is even then

$$M_G = [1] \times [n_1 + 1] \times [1] \times [n_3 + 1] \times \cdots \times [1] \times [n_{d-1} + 1];$$

for odd $d$ we have $\cdots \times [n_{d-2} + 1] \times [1]$ at the end. In other words, $M_G$ is the finite
set of $d$-tuples of integers with only zeros on even positions, and non-negative
integers bounded by $|V_i|$ on every odd position $i$.
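
To make the shape of this set concrete, here is a minimal sketch that computes the per-position bounds and the number of tuples in the set for a given priority function. The function names and the dictionary encoding are assumptions for illustration.

```python
from collections import Counter
from math import prod


def measure_bounds(priority, d):
    """Return (b_0, ..., b_{d-1}) with b_i = 0 on even positions and
    b_i = |V_i| on odd positions, so the set is the product of the [b_i + 1]."""
    count = Counter(priority.values())
    return [count[i] if i % 2 == 1 else 0 for i in range(d)]


def measure_space_size(bounds):
    """Number of tuples: the product of (b_i + 1) over all positions."""
    return prod(b + 1 for b in bounds)


priority = {"u": 0, "v": 1, "w": 1, "x": 2, "y": 3}
bounds = measure_bounds(priority, 4)
print(bounds)                      # [0, 2, 0, 1]
print(measure_space_size(bounds))  # 6
```

In this example two vertices have priority 1 and one has priority 3, so the set is [1] x [3] x [1] x [2] and contains 3 * 2 = 6 tuples.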

Theorem 5 (Small parity progress measure) 

If all cycles in a parity graph $G$ are even then there is a parity progress measure
$\varrho : V \to M_G$ for $G$.

Proof: The proof goes by induction on the number of vertices in $G = (V, E, p :
V \to [d])$. For the induction to go through we slightly strengthen the statement of
the theorem: we additionally claim that if $p(v)$ is odd then $\varrho(v) >_{p(v)} (0, \ldots, 0)$.
The statement of the theorem holds trivially if $G$ has only one vertex.

Without loss of generality we may assume that $V_0 \cup V_1 \neq \emptyset$; otherwise we can
scale down the priority function of $G$ by two, i.e., replace the priority function
$p$ by the function $p - 2$ defined by $(p - 2)(v) = p(v) - 2$, for all $v \in V$. Suppose
first that $V_0 \neq \emptyset$. By induction hypothesis there is a parity progress measure
$\varrho : (V \setminus V_0) \to M_G$ for the subgraph $G \upharpoonright (V \setminus V_0)$. Setting $\varrho(v) = (0, \ldots, 0) \in M_G$,
for all $v \in V_0$, we get a parity progress measure for $G$.

Suppose that $V_0 = \emptyset$; then $V_1 \neq \emptyset$. We claim that there is a non-trivial
partition $(W_1, W_2)$ of the set of vertices $V$, such that there is no edge from $W_1$
to $W_2$ in $G$.

Let $u \in V_1$; define $U \subseteq V$ to be the set of vertices to which there is a
non-trivial path from $u$ in $G$. If $U = \emptyset$ then $W_1 = \{u\}$ and $W_2 = V \setminus \{u\}$ is a desired
partition of $V$. If $U \neq \emptyset$ then $W_1 = U$ and $W_2 = V \setminus U$ is a desired partition. The
partition is non-trivial (i.e., $V \setminus U \neq \emptyset$) since $u \notin U$: otherwise a non-trivial path
from $u$ to itself gives a 1-cycle because $V_0 = \emptyset$, contradicting the assumption
that all cycles in $G$ are even.

Let $G_1 = G \upharpoonright W_1$, and $G_2 = G \upharpoonright W_2$ be subgraphs of $G$. By induction
hypothesis there are parity progress measures $\varrho_1 : W_1 \to M_{G_1}$ for $G_1$, and
$\varrho_2 : W_2 \to M_{G_2}$ for $G_2$. Let $n'_i = |V_i \cap W_1|$, and let $n''_i = |V_i \cap W_2|$, for $i \in [d]$.
Clearly $n_i = n'_i + n''_i$, for all $i \in [d]$. Recall that there are no edges from $W_1$ to
$W_2$ in $G$. From this and our additional claim applied to $\varrho_1$ it follows that the
function $\varrho = \varrho_1 \cup (\varrho_2 + (0, n'_1, 0, n'_3, \ldots)) : V \to M_G$ is a parity progress measure
for $G$. [Theorem 5] ■

Let $\Gamma = (V, E, p, (V_\Diamond, V_\Box))$ be a parity game and let $G = (V, E, p)$ be its game
graph. We define $M_G^\top$ to be the set $M_G \cup \{\top\}$, where $\top$ is an extra element. We
use the standard comparison symbols (e.g., $<$, $=$, $\geq$, etc.) to denote the order on
$M_G^\top$ which extends the lexicographic order on $M_G$ by taking $\top$ as the biggest
element, i.e., we have $m < \top$, for all $m \in M_G$. Moreover, for all $m \in M_G$ and
$i \in [d]$, we set $m <_i \top$, and $\top =_i \top$. If $\varrho : V \to M_G^\top$ and $(v, w) \in E$ then

by $\mathrm{Prog}(\varrho, v, w)$ we denote the least $m \in M_G^\top$, such that $m \geq_{p(v)} \varrho(w)$,
and if $p(v)$ is odd then either the inequality is strict, or $m = \varrho(w) = \top$.
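
A minimal sketch of how Prog can be computed is given below. It assumes that measures are Python tuples, that the top element is represented by `None`, that `bounds[i]` holds the largest value allowed at position i (zero on even positions), and that the subscript p(v) keeps components 0 through p(v). All names are hypothetical.

```python
TOP = None  # stands for the extra top element


def prog(rho, priority, bounds, v, w):
    """Least m with m >=_{p(v)} rho[w], strictly greater when p(v) is odd
    (unless rho[w] is already TOP, in which case the result is TOP)."""
    p = priority[v]
    target = rho[w]
    if target is TOP:
        return TOP
    d = len(bounds)
    # Copy components 0..p and zero the rest: the least m with m >=_p rho[w].
    m = list(target[: p + 1]) + [0] * (d - p - 1)
    if p % 2 == 0:
        return tuple(m)
    # Odd priority: move to the next tuple in the p-truncated order,
    # carrying over like an odometer when a position is at its bound.
    for i in range(p, -1, -1):
        if m[i] < bounds[i]:
            m[i] += 1
            return tuple(m)
        m[i] = 0
    return TOP  # no bigger tuple exists within the bounds


bounds = [0, 2, 0, 1]
priority = {"v1": 1, "v2": 2, "v3": 3}
rho = {"w": (0, 1, 0, 1), "x": (0, 2, 0, 1), "t": TOP}
print(prog(rho, priority, bounds, "v2", "w"))  # even priority: truncate only
print(prog(rho, priority, bounds, "v3", "w"))  # odd priority: strict increase
print(prog(rho, priority, bounds, "v3", "x"))  # overflow gives TOP
```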



Definition 6 (Game parity progress measure) 

A function $\varrho : V \to M_G^\top$ is a game parity progress measure if for all $v \in V$, we
have:

- if $v \in V_\Diamond$ then $\varrho(v) \geq_{p(v)} \mathrm{Prog}(\varrho, v, w)$ for some $(v, w) \in E$, and

- if $v \in V_\Box$ then $\varrho(v) \geq_{p(v)} \mathrm{Prog}(\varrho, v, w)$ for all $(v, w) \in E$;

by $\|\varrho\|$ we denote the set $\{ v \in V : \varrho(v) \neq \top \}$. [Definition 6] □

For every game parity progress measure $\varrho$ we define a strategy $\tilde\varrho : V_\Diamond \to V$ for
player ◇, by setting $\tilde\varrho(v)$ to be a successor $w$ of $v$, which minimizes $\varrho(w)$.



Corollary 7 If $\varrho$ is a game parity progress measure then $\tilde\varrho$ is a winning strategy
for player ◇ from $\|\varrho\|$.

Proof: Note first that $\varrho$ restricted to $\|\varrho\|$ is a parity progress measure on $G_{\tilde\varrho} \upharpoonright \|\varrho\|$.
Hence by Proposition 4 all simple cycles in $G_{\tilde\varrho} \upharpoonright \|\varrho\|$ are even.

It also follows easily from the definition of a game parity progress measure that
strategy $\tilde\varrho$ is closed on $\|\varrho\|$. Therefore, by Proposition 2 we get that $\tilde\varrho$ is a winning
strategy for player ◇ from $\|\varrho\|$. [Corollary 7] ■



Corollary 8 (Small game parity progress measure) 

There is a game progress measure $\varrho : V \to M_G^\top$ such that $\|\varrho\|$ is the winning set
of player ◇.

Proof: It follows from Theorem 1 that there is a winning strategy $\sigma$ for player ◇
from her winning set $W_\Diamond$, which is closed on $W_\Diamond$. Therefore by Proposition 2 all
cycles in parity graph $G_\sigma \upharpoonright W_\Diamond$ are even, hence by Theorem 5 there is a parity
progress measure $\varrho : W_\Diamond \to M_G$ for $G_\sigma \upharpoonright W_\Diamond$. It follows that setting $\varrho(v) = \top$
for all $v \in V \setminus W_\Diamond$, makes $\varrho$ a game parity progress measure. [Corollary 8] ■

4 The Algorithm

In this section we present a simple algorithm for solving parity games based
on the notion of a game parity progress measure. We characterize game parity
progress measures as (pre-)fixed points of certain monotone operators in a finite
complete lattice. By the Knaster-Tarski theorem this implies the existence of the least
game progress measure $\mu$, and a simple way to compute it. It then follows from
Corollaries 7 and 8 that $\|\mu\|$ is the winning set of player ◇.

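Before we present the algorithm we define an ordering, and a family of
$\mathrm{Lift}(\cdot, v)$ operators for all $v \in V$, on the set of functions $V \to M_G^\top$. Given
two functions $\mu, \varrho : V \to M_G^\top$, we define $\mu \sqsubseteq \varrho$ to hold if $\mu(v) \leq \varrho(v)$ for all
$v \in V$. The ordering relation $\sqsubseteq$ gives a complete lattice structure on the set of
functions $V \to M_G^\top$. We write $\mu \sqsubset \varrho$ if $\mu \sqsubseteq \varrho$, and $\mu \neq \varrho$. Define $\mathrm{Lift}(\varrho, v)$ for
$v \in V$ as follows:

$$\mathrm{Lift}(\varrho, v)(u) = \begin{cases} \varrho(u) & \text{if } u \neq v, \\ \max\{\varrho(v), \min_{(v,w) \in E} \mathrm{Prog}(\varrho, v, w)\} & \text{if } u = v \in V_\Diamond, \\ \max\{\varrho(v), \max_{(v,w) \in E} \mathrm{Prog}(\varrho, v, w)\} & \text{if } u = v \in V_\Box. \end{cases}$$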



The following propositions follow immediately from definitions of a game parity
progress measure, and of the $\mathrm{Lift}(\cdot, v)$ operators.

Proposition 9 For every $v \in V$, the operator $\mathrm{Lift}(\cdot, v)$ is $\sqsubseteq$-monotone.

Proposition 10 A function $\varrho : V \to M_G^\top$ is a game parity progress measure, if
and only if it is a simultaneous pre-fixed point of all $\mathrm{Lift}(\cdot, v)$ operators, i.e.,
if $\mathrm{Lift}(\varrho, v) \sqsubseteq \varrho$ for all $v \in V$.

From the Knaster-Tarski theorem it follows that the $\sqsubseteq$-least game parity progress
measure exists, and it can be obtained by running the following simple procedure
computing the least simultaneous (pre-)fixed point of operators $\mathrm{Lift}(\cdot, v)$, for all $v \in V$.



ProgressMeasureLifting
    μ := λv ∈ V.(0, ..., 0)
    while μ ⊏ Lift(μ, v) for some v ∈ V do μ := Lift(μ, v)
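
The procedure can be turned into a short program. The sketch below is one possible rendering under stated assumptions, not the authors' implementation: the top element is represented by `None`, the lifting policy is a plain round-robin sweep over the vertices, the owner of a vertex is given as the string `"diamond"` or `"box"`, and every vertex is assumed to have at least one outgoing edge. The function and variable names are hypothetical.

```python
from collections import Counter

TOP = None  # stands for the extra top element of the measure domain


def key(m):
    """Sort key realising the order with TOP as the biggest element."""
    return (1, ()) if m is TOP else (0, m)


def solve(vertices, edges, priority, owner, d):
    """Return (winning set of player diamond, a strategy for her on that set).

    Assumes priorities lie in range(d) and every vertex has a successor.
    """
    succ = {v: [] for v in vertices}
    for v, w in edges:
        succ[v].append(w)
    count = Counter(priority[v] for v in vertices)
    # Zero on even positions, number of vertices of priority i on odd ones.
    bounds = [count[i] if i % 2 == 1 else 0 for i in range(d)]

    def prog(mu, v, w):
        p, target = priority[v], mu[w]
        if target is TOP:
            return TOP
        m = list(target[: p + 1]) + [0] * (d - p - 1)
        if p % 2 == 0:
            return tuple(m)
        # Odd priority: take the next tuple in the p-truncated order.
        for i in range(p, -1, -1):
            if m[i] < bounds[i]:
                m[i] += 1
                return tuple(m)
            m[i] = 0
        return TOP

    mu = {v: (0,) * d for v in vertices}
    changed = True
    while changed:  # sweep until no vertex can be lifted any more
        changed = False
        for v in vertices:
            values = [prog(mu, v, w) for w in succ[v]]
            pick = min if owner[v] == "diamond" else max
            lifted = max(mu[v], pick(values, key=key), key=key)
            if lifted != mu[v]:
                mu[v] = lifted
                changed = True

    winning = {v for v in vertices if mu[v] is not TOP}
    strategy = {v: min(succ[v], key=lambda w: key(mu[w]))
                for v in winning if owner[v] == "diamond"}
    return winning, strategy


vertices = ["a", "b", "c"]
edges = [("a", "a"), ("a", "b"), ("b", "b"), ("c", "c"), ("c", "a")]
priority = {"a": 1, "b": 0, "c": 1}
owner = {"a": "diamond", "b": "box", "c": "box"}
winning, strategy = solve(vertices, edges, priority, owner, 2)
print(sorted(winning), strategy)
```

In this small game player □ can loop forever on the vertex `c` of priority 1, so `c` is lifted to the top element, while from `a` player ◇ wins by moving to `b`, whose self-loop has priority 0.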



Theorem 11 (The algorithm) 

Given a parity game, procedure ProgressMeasureLifting computes winning
sets for both players and a winning strategy for player ◇ from her winning set;
it works in space $O(dn)$, and its running time is

$$O\left(dm \cdot \left(\frac{n}{\lfloor d/2 \rfloor}\right)^{\lfloor d/2 \rfloor}\right),$$

where $n$ is the number of vertices, $m$ is the number of edges, and $d$ is the
maximum priority in the parity game.

Proof: The result of running ProgressMeasureLifting on a parity game is
the $\sqsubseteq$-least game progress measure $\mu$. Let $W_\Diamond$ be the winning set of player ◇.
By minimality of $\mu$ and by Corollary 8 it follows that $W_\Diamond \subseteq \|\mu\|$. Moreover,
Corollary 7 implies that $\tilde\mu$ is a winning strategy for player ◇ from $\|\mu\|$, and
hence by Theorem 1 we get that $\|\mu\| \subseteq W_\Diamond$, i.e., $\|\mu\| = W_\Diamond$.

Procedure ProgressMeasureLifting works in space $O(dn)$ because
it only needs to maintain a $d$-tuple of integers for every vertex in the game
graph. The $\mathrm{Lift}(\cdot, v)$ operator, for every $v \in V$, can be implemented to work
in time $O(d \cdot \text{out-deg}(v))$, where $\text{out-deg}(v)$ is the out-degree of $v$. Every
vertex can be “lifted” only $|M_G|$ many times, hence the running time of procedure
ProgressMeasureLifting is bounded by

$$O\Bigl(\sum_{v \in V} d \cdot \text{out-deg}(v) \cdot |M_G|\Bigr) = O\bigl(dm \cdot |M_G|\bigr).$$

To get the claimed time bound it suffices to notice that

$$|M_G| = \prod_{i=1}^{\lfloor d/2 \rfloor} (n_{2i-1} + 1) \leq \left(\frac{n}{\lfloor d/2 \rfloor}\right)^{\lfloor d/2 \rfloor},$$

because $\sum_{i=1}^{\lfloor d/2 \rfloor} (n_{2i-1} + 1) \leq n$ if $n_i \neq 0$ for all $i \in [d]$, which we can assume
without loss of generality; if $n_i = 0$ for some $i \in [d]$ then we can scale down the
priorities bigger than $i$ by two. [Theorem 11] ■
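
As a quick sanity check of this inequality, consider a hypothetical game (the numbers are chosen here for illustration and do not come from the paper) with $d = 4$ and $n = 6$ vertices, distributed over the priorities as $n_0 = 1$, $n_1 = 2$, $n_2 = 1$, $n_3 = 2$:

$$|M_G| = (n_1 + 1)(n_3 + 1) = 3 \cdot 3 = 9 \leq \left(\frac{6}{2}\right)^{2} = 9, \qquad (n_1 + 1) + (n_3 + 1) = 6 \leq n.$$

Here the bound is attained because the two factors on odd positions are equal; an uneven split such as $n_1 = 3$, $n_3 = 1$ gives $4 \cdot 2 = 8 < 9$.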

Remark: Our algorithm for solving parity games can be easily made to have

$$O\left(dm \cdot \left(\frac{n}{d}\right)^{\lceil d/2 \rceil}\right)$$

as its worst-case running time bound, which is better than $O(m \cdot (n/\lfloor d/2 \rfloor)^{\lfloor d/2 \rfloor})$
for even $d$, and for odd $d$ if $d \geq 2 \log n$. If $\sum_{i=1}^{\lfloor d/2 \rfloor} (n_{2i-1} + 1) \leq n/2$ then the
above analysis gives the desired bound. Otherwise $\sum_{i} (n_{2i} + 1) \leq n/2 + 1$.
In this case it suffices to run procedure ProgressMeasureLifting on the dual
game, i.e., the game obtained by scaling all priorities up by one, and swapping
sets in the $(V_\Diamond, V_\Box)$ partition. The winning set of player ◇ in the original game
is the winning set of player □ in the dual game. [Remark] □
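
The dual game mentioned in the remark is easy to construct. The following minimal sketch assumes the same hypothetical encoding as above: a dictionary of priorities and a dictionary assigning each vertex the owner `"diamond"` or `"box"`; the edge relation is unchanged and therefore not passed in.

```python
def dual_game(priority, owner):
    """Shift every priority up by one and swap the two players' vertex sets."""
    swap = {"diamond": "box", "box": "diamond"}
    dual_priority = {v: p + 1 for v, p in priority.items()}
    dual_owner = {v: swap[o] for v, o in owner.items()}
    return dual_priority, dual_owner


priority = {"a": 1, "b": 0}
owner = {"a": "diamond", "b": "box"}
print(dual_game(priority, owner))
```

Shifting by one flips the parity of every priority, so a play that was won by one player in the original game is won by the other player in the dual game.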

Note that in order to make ProgressMeasureLifting a fully deterministic
algorithm one has to fix a policy of choosing vertices at which the function
$\mu$ is being “lifted”. Hence it can be considered as a generic algorithm whose
performance might possibly depend on supplying heuristics for choosing the
vertices to lift. Unfortunately, as we show in the next section, there is a family
of examples on which the worst case performance of the algorithm occurs for all
vertex lifting policies.

5 Worst-Case Behaviour

Theorem 12 (Worst-case behaviour)

For all $d, n \in \mathbb{N}$ such that $d \leq n$, there is a game of size $O(n)$ with priorities not
bigger than $d$, on which procedure ProgressMeasureLifting performs at least
$(\lceil n/d \rceil)^{\lfloor d/2 \rfloor}$ many lifts, for all lifting policies.

Proof: We define the family of games $H_{\ell,b}$, for all $\ell, b \in \mathbb{N}$. The game graph
of $H_{\ell,b}$ consists of $\ell$ “levels”, each level contains $b$ “blocks”. There is one “odd”
level, and $\ell - 1$ “even” levels.

The basic building block of the odd level is the following subgraph. 




The numbers in vertices are their priorities. The odd level of $H_{\ell,b}$ consists of $b$
copies of the above block assembled together by identifying the left-hand vertex
with priority $2\ell$ of the $a$-th block, for every $a \in \{1, 2, \ldots, b-1\}$, with the
right-hand vertex with priority $2\ell$ of the $(a+1)$-st block. For example the odd level
of is the following. 




In all our pictures vertices with a diamond-shaped frame are meant to belong
to $V_\Diamond$, i.e., they are vertices where player ◇ moves; vertices with a box-shaped
frame belong to $V_\Box$. Some vertices have no frame; for concreteness let us
assume that they belong to $V_\Diamond$, but including them to $V_\Box$ would not change our
reasoning, because they all have only one successor in the game graph of $H_{\ell,b}$.

The basic building block of the $k$-th even level, for $k \in \{1, 2, \ldots, \ell - 1\}$, is
the following subgraph.







Every even level is built by putting b copies of the above block together in a 
similar way as for the odd level. 

To assemble the game graph of $H_{\ell,b}$ we connect all $\ell - 1$ even levels to
the odd level, by introducing edges in the following way. For every even level
$k \in \{1, 2, \ldots, \ell - 1\}$, and for every block $a \in \{1, 2, \ldots, b\}$, we introduce edges in
both directions between the box vertex with priority $2\ell - 1$ from the $a$-th block
of the odd level, and the diamond vertex with priority $2k$ from the $a$-th block of
the $k$-th even level. See Figure 1 for an example: the game $H_{4,3}$.

Claim 13 Every vertex with priority $2\ell - 1$ in game $H_{\ell,b}$ is lifted $(b+1)^\ell$ many
times by procedure ProgressMeasureLifting.

Fig. 1. The game $H_{4,3}$.



Proof: Note that in game $H_{\ell,b}$ player ◇ has a winning strategy from all vertices
in even levels, and player □ has a winning strategy from all vertices in the odd
level; see Figure 1. Therefore, the value of the least progress measure in all
vertices with priority $2\ell - 1$ is $\top \in M^\top_{H_{\ell,b}}$. Hence it suffices to show that every
vertex with priority $2\ell - 1$ can be lifted only to its immediate successor in the
order on $M^\top_{H_{\ell,b}}$. Then it is lifted $|M_{H_{\ell,b}}| = (b+1)^\ell$ many times, because

$$M_{H_{\ell,b}} = \underbrace{[1] \times [b+1] \times [1] \times [b+1] \times \cdots \times [b+1] \times [1]}_{2\ell+1 \text{ components}}.$$

Let $v$ be a vertex with priority $2\ell - 1$ in the odd level of $H_{\ell,b}$, and let
$w$ be a vertex, such that there is an edge from $v$ to $w$ in the game graph of
$H_{\ell,b}$. Then there is also an edge from $w$ to $v$ in the game graph of $H_{\ell,b}$; see
Figure 1. Therefore, function $\mu$ maintained by the algorithm satisfies $\mu(w) \leq
\mu(v)$, because $w$ is a diamond vertex with even priority, so $\mathrm{Prog}(\mu, w, v) =_{p(w)}
\mu(v)$, and $(\mathrm{Prog}(\mu, w, v))_i = 0$ for all $i > p(w)$. It follows that the $\mathrm{Lift}(\cdot, v)$ operator
can only lift $\mu(v)$ to the immediate successor of $\mu(v)$ in the order on $M^\top_{H_{\ell,b}}$,
because the priority of $v$ is $2\ell - 1$. [Claim 13] ■

Theorem 12 follows from the above claim by taking the game $H_{\lfloor d/2 \rfloor, \lceil n/d \rceil}$.

[Theorem 12] ■



6 Optimizations 

Even though procedure ProgressMeasureLifting as presented above can exhibit 
this worst-case behaviour, there is some room for improving its running 
time. Let us just mention here two proposals for optimizations, which should be 
considered when implementing the algorithm. 

One way is to get better upper bounds on the values of the least game parity 
progress measure than the one provided by Corollary 0, taking into account the 
structure of the game graph. This would allow us to further reduce the “search 
space” in which the algorithm looks for game progress measures. For example, 
let $G^{\ge i}$ be the parity graph obtained from the game graph $G$ by removing 
all vertices with priorities smaller than $i$. One can show that if 
$v \in \|\mu\|$ for the least game progress measure $\mu$ then for odd $i$ the 
$i$-th component of $\mu(v)$ is bounded by the number of vertices of priority $i$ 
reachable from $v$ in the graph $G^{\ge i}$. It requires further study to see 
whether one can get considerable improvements by pre-computing better bounds for 
the values of the least game parity progress measure. 

Another simple but important optimization is to decompose game graphs into 
maximal strongly connected components. Note that every infinite play eventually 
stays within a strongly connected component, so it suffices to apply the 
expensive procedure for solving parity games to the maximal strongly connected 
components separately. In fact, we need to proceed bottom up in the partial 
order of maximal strongly connected components. Each time one of the bottom 
components has been solved, we can also remove from the rest of the game the 
sets of vertices from which the respective players have a strategy to force the 
play, in a finite number of moves, into their winning sets computed so far. 
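
The bottom-up processing order can be sketched as follows. This is only an illustration, not the authors' implementation: it uses Tarjan's algorithm, which is known to emit strongly connected components sinks first, so the returned list can be processed left to right. The function name `sccs_bottom_up` and the adjacency-dictionary representation are assumptions made for the example; the parity-game solving step itself is not shown.

```python
from typing import Dict, Hashable, List


def sccs_bottom_up(graph: Dict[Hashable, List[Hashable]]) -> List[List[Hashable]]:
    """Return the maximal strongly connected components of `graph`.

    Tarjan's algorithm emits a component only after every component
    reachable from it, so the result is ordered bottom up.
    """
    index: Dict[Hashable, int] = {}   # discovery number of each vertex
    low: Dict[Hashable, int] = {}     # smallest index reachable via the DFS stack
    stack: List[Hashable] = []
    on_stack = set()
    result: List[List[Hashable]] = []
    counter = 0

    def visit(v: Hashable) -> None:
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:        # v is the root of a component
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            result.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return result
```

The recursive formulation is adequate for small graphs; for large game graphs an iterative variant would be needed to avoid Python's recursion limit.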

The above optimizations should considerably improve the performance of our 
algorithm in practice, but they do not, as such, give any asymptotic worst-case 
improvement: see the examples $H_{\ell,b}$ from Section 5. 

Abstract. We investigate self-testing programs with relative error by 
allowing error terms proportional to the function to be computed. Until 
now, in numerical computation, error terms were assumed to be either 
constant or proportional to the $p$-th power of the magnitude of the input, 
for $p \in [0, 1)$. We construct new self-testers with relative error for 
real-valued multi-linear functions defined over finite rational domains. The 
existence of such self-testers positively solves an open question in [KMS99]. 
Moreover, our self-testers are very efficient: they use few queries and 
simple operations. 



Keywords — Program verification, approximation error, self-testing programs, 
robustness and stability of functional equations. 

1 Introduction 

It is not easy to write a program $P$ to compute a real-valued function $f$. By 
definition of floating point computations, a program $P$ can only compute an 
approximation of $f$. The succession of inaccuracies in computational operations 
could be significant. Moreover once $P$ is implemented it is more difficult to 
verify its correctness, i.e. that $P(x)$ is a good approximation of $f(x)$ for 
all valid inputs $x$. In a good approximation one would like the significant 
figures to be correct. This leads us to the notion of relative error. If $a$ is 
a real number and $\hat{a}$ is its approximation, then the quantity 
$\theta = |\hat{a} - a|/a$ is called the relative error of the approximation. 

RAND2 No. 21726, and Franco-Hungarian bilateral project Balaton No. 99013. 

In recent years, several notions were developed to address the software 
correctness problem. Here we focus on the following scenario. First, the program 
to be tested is viewed as a black box, i.e. we can only query it on some inputs. 
Second, we want a very efficient testing procedure. In particular, a test should 
be more efficient than any known correct implementation. For exact computation, 
program checking, self-testing programs [BLR93], and self-correcting programs 
[BLR93, Lip91] were developed in the early 90's. A program checker for $f$ 
verifies whether the program $P$ computes $f$ on a particular input $x$; a 
self-tester for $f$ verifies whether the program $P$ is correct on most inputs; 
and a self-correcting program for $f$ uses a program $P$, which is correct on 
most inputs, to compute $f$ correctly everywhere with high probability. Let us 
insist that checkers, self-testers and self-correctors can only use the program 
$P$ as a black box, and are required to be different from and simpler than any 
known implementation of $f$ (see for a formal definition). In this context, 
results on testing linear functions and polynomials have theoretical 
implications for probabilistically checkable proofs [ALM+92] and in 
approximation theory. For 

a survey see HOT). 

Let us recall the problem of linearity testing which has been fundamental 
in the development of testers [BLR93]. Given a program $P$ which computes a 
function from one Abelian group $G$ into another group, we want to verify that 
$P$ computes a homomorphism on most inputs in $G$. The Blum-Luby-Rubinfeld 
linearity test is based on the linearity property $f(x+y) = f(x) + f(y)$, for 
all $x, y \in G$, which is satisfied when $f$ is a homomorphism. The test 
consists in verifying the previous linearity equation on random instances. More 
precisely, it checks for random inputs $x, y \in G$ that 
$P(x+y) = P(x) + P(y)$. If the probability of failing the linearity test is 
small, then $P$ computes a homomorphism except on a small fraction of inputs. 
This property of the linearity equation is usually called the robustness of the 
linearity equation. This term was defined in [RS96] and studied in [Rub99]. The 
analysis of the test is due to Coppersmith [Cop89]. It consists in correcting 
$P$ by making few queries to it. Let $g$ be the function which takes at $x$ the 
majority of the votes $(P(x+y) - P(y))$, for all $y \in G$. When the failure 
probability in the linearity test is small, majority turns out to be 
quasi-unanimity, $g$ equals $P$ on a large fraction of inputs, and $g$ is 
linear. This idea of property testing has been recently formalized and extended 
to testing graph properties in [GGR96, GR97]. 
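
The exact linearity test and the majority correction described above can be illustrated with a minimal sketch, assuming the group is $\mathbb{Z}_m$ under addition. The names `blr_failure_rate`, `majority_correct`, `exact` and `faulty`, the modulus 101, and the number of rounds are all choices made for this example and do not come from the paper.

```python
import random
from collections import Counter

M = 101  # assumed group Z_M, chosen only for illustration


def blr_failure_rate(P, m, rounds=200, seed=0):
    """Fraction of sampled pairs (x, y) with P(x+y) != P(x) + P(y) in Z_m."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(rounds):
        x, y = rng.randrange(m), rng.randrange(m)
        if P((x + y) % m) != (P(x) + P(y)) % m:
            failures += 1
    return failures / rounds


def majority_correct(P, m, x):
    """g(x): the most frequent vote P(x+y) - P(y), taken over all y in Z_m."""
    votes = Counter((P((x + y) % m) - P(y)) % m for y in range(m))
    return votes.most_common(1)[0][0]


def exact(x):
    """A homomorphism of Z_M."""
    return (3 * x) % M


def faulty(x):
    """The same homomorphism, corrupted at the single input 5."""
    return 0 if x == 5 else (3 * x) % M
```

For instance, `majority_correct(faulty, M, 5)` recovers the value the homomorphism takes at 5, because only the votes that touch the corrupted input disagree with the rest.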

These notions of testing were extended to approximate computation with 
absolute error for self-testers/correctors [GLR+91] and for checkers [ABCG93]. 
In [GLR+91] Gemmell et al. studied only functions defined over algebraically 
closed domains. Ergün, Ravi Kumar, and Rubinfeld [EKR96] initiated and solved 
the problem of self-testing with absolute error for linear functions, 
polynomials, and additive functions defined over rational domains. Rational 
domains were first considered by Lipton [Lip91] and these are the sets 
$\mathcal{D}_{n,s} = \{i/s : |i| \le n, i \in \mathbb{Z}\}$, for some integer 
$n \ge 1$ and real $s > 0$. In these past works the absolute error of the 
approximation $\hat{a}$ of $a$ is defined by $\varepsilon = |\hat{a} - a|$. In 
this approximate context the linearity testing problem consists now in verifying 
that a given program $P$ computes approximately a real linear function over 
$\mathcal{D}_{n,s}$. To allow absolute error in the computation of $P$, the 
approximate linearity test consists in verifying that 
$|P(x+y) - P(x) - P(y)| \le \varepsilon$, for random $x, y \in \mathcal{D}_{n,s}$ 
and some fixed $\varepsilon > 0$. Then the analysis is very similar to that of 
the exact case. Since the majority is not adapted to approximate computation, it 
is replaced by the median. Moreover both the closeness of $g$ to $P$ and the 
linearity of $g$ are approximated. Therefore we need a second stage which 
consists in proving the local stability of the linearity equation for absolute 
error, that is, every function satisfying 
$|f(x+y) - f(x) - f(y)| \le \varepsilon$, for all $x, y \in \mathcal{D}_{n,s}$, 
is close to a perfectly linear function. This part is a well-studied problem in 
mathematics for several kinds of error terms when $x$ and $y$ describe a group 
like $\mathbb{Z}$. It corresponds to the study of Hyers-Ulam stability. The 
stability problem is due to Ulam and was first solved in the absolute error case in 
1941 by Hyers For a survey of Hyers-Ulam stability see ESZniEEMI- 

Using elegant techniques of Hyers-Ulam stability theory, Kiwi et al. extended 
a part of the work of [EKR96] to non-constant error terms [KMS99]. They 
considered error terms proportional in every input $x$ to $|x|^p$, for any 
$0 \le p < 1$, that is, they considered computations where inaccuracies depend 
on the size of the values involved in the calculations. This model corresponds 
to many practical situations. Among other things, they show how to self-test 
whether a program approximately computes a linear function for these error 
terms. For this they proved the local stability of the linearity equation from 
its stability on the whole domain $\mathbb{Z}$, using techniques based on an 
argument due to Skof [Sko83]. The robustness part is similar to the absolute 
error case, but the set of voters in the median defining $g(x)$ depends on $x$, 
since big voters may induce big errors for small $x$. Since the linearity 
equation is unstable for the case $p = 1$ [HS92], their work did not lead to 
self-testers either for the case $p = 1$, which corresponds to linear error 
terms, or for relative error terms (i.e. proportional to the function to be 
computed) [KMS99, Sect. 5]. 

In this paper, we study approximate self-testing with relative error. 
Relative error is one of the most important notions in numerical 
computation. Proving that a program is relatively close to its correct 
implementation is the challenge of many numerical analysts. We hope to 
contribute to making self-testers better suited to numerical computation. In 
this setting self-testing consists in the following task: 

Problem. Given a class of real-valued functions $\mathcal{F}$ defined over a 
finite domain $D$, and some positive constants $c_1, c_2, \delta_1, \delta_2$, we 
want a simple and efficient probabilistic algorithm $T$ such that, for any 
program $P : D \to \mathbb{R}$, which is an oracle for $T$: 

- if for some $f \in \mathcal{F}$, $\Pr_{x \in D}[|P(x) - f(x)| > c_1 |f(x)|] \le \delta_1$, 
  then $T$ outputs PASS with high probability; 

- if for all $f \in \mathcal{F}$, $\Pr_{x \in D}[|P(x) - f(x)| > c_2 |f(x)|] > \delta_2$, 
  then $T$ outputs FAIL with high probability. 

We give a positive answer to this problem for the set of real-valued $d$-linear 
functions, for any integer $d \ge 1$. This is the first positive answer to this 
problem in the literature. In particular, we solve some problems in [KMS99] 
that were mentioned previously. For the sake of brevity and clarity we will 
consider functions defined over positive integer domains 
$\mathcal{D}_n^+ = \{i \in \mathbb{N} : 1 \le i \le n\}$, for some even integer 
$n > 1$. But all of our results remain valid for more general rational domains. 

First we define in Theorem 2 a new probabilistic test for linear functions. 
It is constructed from a new functional equation for linearity which is robust 
(Theorem 3) and stable for linear error terms (Theorem 4). We use it to build 
an approximate self-tester for linear functions which allows linear error terms 
(Theorem 5). From it we are able to construct the first approximate self-tester 
with relative error in the sense of the stated problem (Theorem 6). This 
self-tester is generalized for multi-linear functions in Theorem 7 using an 
argument similar to that in [FHS94]. These self-testers are quite surprising 
since they only use comparisons, additions, and multiplications by powers of 2 
(i.e. left or right shifts in binary representation). Moreover the number of 
queries and operations does not depend on $n$. 

2 Linearity 

The linearity test of [KMS99] is based on the linearity equation 
$f(x+y) = f(x) + f(y)$ which is robust and stable for error terms proportional 
to $|x|^p$, where $0 \le p < 1$, but unstable when $p = 1$. More precisely they 
showed: 

Theorem 1 ([KMS99, Theorem 2]) Let $0 < \delta < 1$, $\theta > 0$, and 
$0 \le p < 1$. If $P$ is such that 

    $\Pr_{x,y}[|P(x+y) - P(x) - P(y)| > \theta \max\{|x|^p, |y|^p\}] \le \delta$, 

then there exists a linear function $l$ such that for $C_p = (1+2^p)/(2-2^p)$, 

    $\Pr_{x}[|P(x) - l(x)| > 17 C_p \theta |x|^p] \le O(\sqrt{\delta})$. 

(If $p = 0$ then the latter inequality holds with $O(\delta)$ in its RHS.) 

Remark. In this theorem and in the rest of the paper we only consider uniform 
probabilities. 

For $p = 1$ the statement of this theorem does not hold anymore. Let 
$\theta > 0$ and $f(x) = \theta x \log_2(x+1)$, for all $x > 0$. In [HS92] it is 
shown that $f$ satisfies $|f(x+y) - f(x) - f(y)| < 2\theta \max\{x, y\}$, for 
all $x, y > 0$, but $f$ is not close to any linear function. Hence either the 
test or the error term has to be modified; both cannot be kept. In [KMS99] the 
linearity test was unchanged and error terms proportional to $|x|^p$ were 
considered, for some $0 \le p < 1$. In this paper we change the test but keep a 
linear error term. 

All results of this paper are based on the following theorem. It defines a 
probabilistic test such that the distance of any program to linear functions is 
upper bounded by a constant times its failure probability on the test. Here the 
distance is not yet relative but it is defined for a linear error term. Let 
$\mathrm{Med}_{x \in X}(f(x))$ denote the median value of $f : X \to \mathbb{R}$ 
when $x$ ranges over $X$: 

    $\mathrm{Med}_{x \in X}(f(x)) \stackrel{\mathrm{def}}{=} \inf\{a \in \mathbb{R} : \Pr_{x \in X}[f(x) > a] \le 1/2\}$. 

For every integer $x \ge 1$, let $k_x$ denote the number: 

    $k_x \stackrel{\mathrm{def}}{=} \min\{k \in \mathbb{N} : 2^k x \ge n/2\}$. 

Theorem 2 Let $0 < \delta < 1/96$ and $\theta > 0$. If $P$ is such that 

    $\Pr_{x,y}[|P(2^{k_x}x + y) - 2^{k_x}P(x) - P(y)| > \theta n] \le \delta$, 

then the linear function $l : \mathcal{D}_n^+ \to \mathbb{R}$, which is defined by 

    $l(n) \stackrel{\mathrm{def}}{=} \mathrm{Med}_{y \in \mathcal{D}_{2n}^+}(P(n+y) - P(y))$, 

satisfies 

    $\Pr_{x \in \mathcal{D}_n^+}[|P(x) - l(x)| > 32\theta x] \le 16\delta$. 

The proof of this theorem goes in two parts: the robustness (Theorem 3), and 
the stability (Theorem 4). Let us give the intuition for this test. When 
$x \ge n/2$, i.e. $x$ is large, the test looks like the standard linearity test. 
But when $x < n/2$, i.e. $x$ is small, we add a dilation term which amplifies 
small errors. 
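
The dilation exponent $k_x$ and the median can be made concrete with a short sketch. It is an illustration under stated assumptions, not code from the paper: the function names `k_x`, `med` and `g` are hypothetical, the median is taken as the lower median of a finite list (one reading of the $\inf$ definition), and the votes range over $y \in \{1, \ldots, 2n\}$, which is an assumed range.

```python
def k_x(x: int, n: int) -> int:
    """Smallest k >= 0 such that 2**k * x >= n/2 (n even, x >= 1)."""
    k = 0
    while (x << k) * 2 < n:   # 2^k * x < n/2, written without division
        k += 1
    return k


def med(values):
    """Lower median of a finite list: the least a with Pr[v > a] <= 1/2."""
    ordered = sorted(values)
    return ordered[(len(ordered) - 1) // 2]


def g(P, x: int, n: int):
    """Median of the votes (P(2^k x + y) - P(y)) / 2^k over y = 1..2n.

    The range of y is an assumption made for this sketch.
    """
    k = k_x(x, n)
    return med([(P((x << k) + y) - P(y)) / 2 ** k for y in range(1, 2 * n + 1)])
```

With $n = 16$, the input $x = 1$ is dilated by $2^3$ so that $2^{k_x} x = 8 = n/2$, while any $x \ge 8$ is left unchanged ($k_x = 0$).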

2.1 Robustness 

This part consists in constructing, using $P$, a function $g$ which is not 
linear, but approximately linear for large inputs, and perfectly homothetic for 
small inputs. In a sense $g$ approximately corrects the program $P$. 

The following theorem states the existence of such a function $g$. The 
definition of $g$ is based on the probabilistic test: for each $x$, $g(x)$ is 
the median of the votes $(P(2^{k_x}x + y) - P(y))/2^{k_x}$ over all 
$y \in \mathcal{D}_{2n}^+$. If the probability that $P$ fails the test is small, 
then $g$ satisfies the following theorem. 

Theorem 3 (Robustness) Let $0 < \delta < 1/96$ and $\theta > 0$. If $P$ is 
such that 

    $\Pr_{x,y}[|P(2^{k_x}x + y) - 2^{k_x}P(x) - P(y)| > \theta n] \le \delta$, 

then the function $g : \mathcal{D}_{2n}^+ \to \mathbb{R}$, which is defined by 

    $g(x) \stackrel{\mathrm{def}}{=} \mathrm{Med}_{y \in \mathcal{D}_{2n}^+}(P(2^{k_x}x + y) - P(y))/2^{k_x}$, 

satisfies 

    $\Pr_{x \in \mathcal{D}_n^+}[|P(x) - g(x)| > 2\theta x] \le 16\delta$,    (1) 

    $\forall x, y \in \{n/2, \ldots, n\}$,  $|g(x+y) - g(x) - g(y)| \le 6\theta n$,    (2) 

    $\forall x \in \mathcal{D}_n^+$,  $g(x) = g(2^{k_x}x)/2^{k_x}$.    (3) 

Proof. The proof uses standard techniques developed in [BLR93, EKR96, KMS99]. 
Let us observe first that the function $g$ satisfies 

    $g(x) = \begin{cases} \mathrm{Med}_{y \in \mathcal{D}_{2n}^+}(P(x+y) - P(y)), & \text{if } x \ge n/2, \\ g(2^{k_x}x)/2^{k_x}, & \text{otherwise,} \end{cases}$ 

and therefore $g$ satisfies (3). Before proving that $g$ also satisfies (1) and 
(2) we state a useful fact called the halving principle [KMS99]. 

Fact 1 (Halving principle) Let $\Omega$ and $S$ denote finite sets such that 
$S \subseteq \Omega$, and let $\psi$ be a boolean function defined over 
$\Omega$. Then for uniform probabilities, 

    $\Pr_{x \in S}[\psi(x)] \le \frac{|\Omega|}{|S|} \Pr_{x \in \Omega}[\psi(x)]$. 

First we show that $g$ is close to $P$ as defined in (1). To simplify notation, 
let $P_{x,y} = P(2^{k_x}x + y) - P(y) - 2^{k_x}P(x)$. By definition of $g$ we get 

    $\Pr_{x \in \mathcal{D}_n^+}[|g(x) - P(x)| > 2\theta x] = \Pr_{x \in \mathcal{D}_n^+}[|\mathrm{Med}_{y \in \mathcal{D}_{2n}^+}(P_{x,y})| > 2\theta 2^{k_x}x]$. 

Notice that Markov's inequality gives a bound on the RHS of this equality: 

    $\Pr_{x \in \mathcal{D}_n^+}[|\mathrm{Med}_{y \in \mathcal{D}_{2n}^+}(P_{x,y})| > 2\theta 2^{k_x}x] \le 2 \Pr_{x \in \mathcal{D}_n^+, y \in \mathcal{D}_{2n}^+}[|P_{x,y}| > 2\theta 2^{k_x}x]$. 

But $2^{k_x}x \ge n/2$ for all $x \in \mathcal{D}_n^+$, so 
$|P_{x,y}| > 2\theta 2^{k_x}x$ implies $|P_{x,y}| > \theta n$, and the halving 
principle bounds the last probability by a constant multiple of the failure 
probability $\Pr[|P_{x,y}| > \theta n]$ of the test. Therefore $g$ satisfies (1). 

Now we prove that $g$ satisfies (2). First we show that for all 
$c \in \{n/2, \ldots, 2n\}$ the median value $g(c)$ is close to any vote 
$(P(c+y) - P(y))$ with high probability: 

    $\Pr_{y}[|g(c) - (P(c+y) - P(y))| > 2\theta n] \le 16\delta$.    (4) 

Note that $k_c = 0$, therefore Markov's inequality implies 

    $\Pr_{y}[|g(c) - (P(c+y) - P(y))| > 2\theta n] \le 2 \Pr_{y,z}[|P_{c+y,z} - P_{c+z,y}| > 2\theta n]$. 

Then one can get inequality (4) using the union bound and the halving principle. 

Now let $a$ and $b$ be two integers such that $n/2 \le a, b \le n$. Let $c$ 
take on the values $a$, $b$ and $a+b$ in (4) and apply the halving principle to 
obtain that each of the three probabilities 

    $\Pr_{y \in \mathcal{D}_n^+}[|g(a+b) - (P(a+b+y) - P(y))| > 2\theta n]$, 

    $\Pr_{y \in \mathcal{D}_n^+}[|g(a) - (P(a+y) - P(y))| > 2\theta n]$, 

    $\Pr_{y \in \mathcal{D}_n^+}[|g(b) - (P(b+(a+y)) - P(a+y))| > 2\theta n]$ 

is at most $32\delta$. 

Therefore with probability at least $1 - 96\delta > 0$ there exists 
$y \in \mathcal{D}_n^+$ for which none of these inequalities are satisfied. Pick 
such a $y$: the three votes telescope, since 
$(P(a+b+y) - P(y)) - (P(a+y) - P(y)) - (P(b+(a+y)) - P(a+y)) = 0$, which gives 
inequality (2). □ 

2.2 Stability 

In this section we prove that every function $g$ satisfying the conditions of 
Theorem 3 is close to a perfectly linear one. 

Theorem 4 (Stability) Let $\theta' > 0$. If $g : \mathcal{D}_{2n}^+ \to \mathbb{R}$ 
is such that 

    $\forall x, y \in \{n/2, \ldots, n\}$,  $|g(x+y) - g(x) - g(y)| \le \theta' n$, 
    and $\forall x \in \mathcal{D}_n^+$,  $g(x) = g(2^{k_x}x)/2^{k_x}$, 

then the linear function $l : \mathcal{D}_n^+ \to \mathbb{R}$, which is defined 
by $l(n) \stackrel{\mathrm{def}}{=} g(n)$, satisfies, for all 
$x \in \mathcal{D}_n^+$, 

    $|g(x) - l(x)| \le 5\theta' x$. 

Proof. Here we borrow a technique developed in [KMS99] that we apply 
to the function $g$ where it is approximately linear. First we extend $g$ 
restricted to $\{n/2, \ldots, n\}$ to a function $h$ defined over the whole 
semi-group $\{x \in \mathbb{N} : x \ge n/2\}$. The extension $h$ is defined for 
all $x \ge n/2$ by 

    $h(x) \stackrel{\mathrm{def}}{=} \begin{cases} g(x) & \text{if } n/2 \le x \le n, \\ h(x - n/2) + g(n)/2 & \text{otherwise.} \end{cases}$ 

One can verify that $h$ satisfies the following doubling property, for all 
$x \ge n/2$, 

    $|h(2x) - 2h(x)| \le 5\theta' n/2$. 

Then we apply a result due to which is based on some techniques 

developed in mm . 

Lemma 1 ([KMS99, Lemma 3]) Let $E_1$ be a semi-group and $E_2$ a Banach 
space. Let $\varepsilon > 0$ and $h : E_1 \to E_2$ be a mapping such that for all 
$x \in E_1$ 

    $\|h(2x) - 2h(x)\| \le \varepsilon$. 

If $f : E_1 \to E_2$ is such that $f(x) = \lim_{m \to \infty} h(2^m x)/2^m$ is a 
well defined mapping, then for all $x \in E_1$ 

    $\|h(x) - f(x)\| \le \varepsilon$. 

Let $f$ be this function. Then by definition of $h$ we get that 
$f(x) = x g(n)/n$, for all $x \ge n/2$, therefore $f$ is linear and $f = l$. We 
conclude the proof by recalling that $g$ equals $h$ on $\{n/2, \ldots, n\}$, and 
when $1 \le x < n/2$, $g(x) = g(2^{k_x}x)/2^{k_x}$. □ 

3 Testing with Relative Error 

In this section we show how our results lead to approximate self-testers with 
linear error terms and with relative error. First let us define the relative 
distance. Let $\mathcal{F}$ be a collection of real functions defined over a 
finite domain $D$. For a real $\theta > 0$ and functions 
$P, f : D \to \mathbb{R}$, we will define the $\theta$-relative distance between 
$P$ and $f$ on $D$ by 

    $\theta\text{-rdist}_D(P, f) \stackrel{\mathrm{def}}{=} \Pr_{x \in D}[|P(x) - f(x)| > \theta |f(x)|]$, 

and the $\theta$-relative distance between $P$ and $\mathcal{F}$ on $D$ by 

    $\theta\text{-rdist}_D(P, \mathcal{F}) \stackrel{\mathrm{def}}{=} \inf_{f \in \mathcal{F}} \theta\text{-rdist}_D(P, f)$. 

Note that the relative distance between $P$ and $f$ is not symmetric in 
general. For example if $\theta > 1$, $P(x) = 0$, and $f(x) = \theta$, for all 
$x \in \mathcal{D}_n^+$, then $\theta\text{-rdist}_{\mathcal{D}_n^+}(P, f) = 0$ and 
$\theta\text{-rdist}_{\mathcal{D}_n^+}(f, P) = 1$. We will also need another 
distance which is symmetric but not relative. It is defined for any non negative 
error term $\beta : D \to \mathbb{R}_+$. The $\beta$-distance between $P$ and $f$ 
on $D$ is 

    $\beta\text{-dist}_D(P, f) \stackrel{\mathrm{def}}{=} \Pr_{x \in D}[|P(x) - f(x)| > \beta(x)]$, 

and the $\beta$-distance between $P$ and $\mathcal{F}$ on $D$ is 

    $\beta\text{-dist}_D(P, \mathcal{F}) \stackrel{\mathrm{def}}{=} \inf_{f \in \mathcal{F}} \beta\text{-dist}_D(P, f)$. 
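
The asymmetry of the relative distance can be checked directly on a small finite domain. The sketch below is only an illustration: the function names `rdist` and `dist`, the domain $\{1, \ldots, 10\}$ and the value $\theta = 2$ are choices made for the example.

```python
def rdist(P, f, domain, theta):
    """theta-rdist_D(P, f): fraction of x with |P(x) - f(x)| > theta * |f(x)|."""
    bad = sum(1 for x in domain if abs(P(x) - f(x)) > theta * abs(f(x)))
    return bad / len(domain)


def dist(P, f, domain, beta):
    """beta-dist_D(P, f): fraction of x with |P(x) - f(x)| > beta(x)."""
    bad = sum(1 for x in domain if abs(P(x) - f(x)) > beta(x))
    return bad / len(domain)


THETA = 2                 # any theta > 1 works for the example in the text
DOMAIN = range(1, 11)     # a small stand-in for the domain


def zero(x):
    return 0


def const_theta(x):
    return THETA
```

Here `rdist(zero, const_theta, DOMAIN, THETA)` is 0 because the error $\theta$ never exceeds $\theta \cdot \theta$, while swapping the arguments gives 1 because the tolerance $\theta \cdot |0|$ vanishes. The non-relative `dist` gives the same value in both directions.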



First we define the approximate self-tester for $\beta\text{-dist}_D$ using the 
definition of [KMS99] which generalizes that of [GLR+91]. 

Definition 1 Let $\delta_1, \delta_2 \in [0, 1]$, $D_2 \subseteq D_1$, and 
$\mathcal{F}$ be a collection of real-valued functions defined over $D_1$. Let 
$\beta_1$ and $\beta_2$ be non negative real-valued functions also defined over 
$D_1$. A $(D_1, \beta_1, \delta_1; D_2, \beta_2, \delta_2)$-self-tester for 
$\mathcal{F}$ is a probabilistic oracle program $T$ such that for any program 
$P : D_1 \to \mathbb{R}$: 

- If $\beta_1\text{-dist}_{D_1}(P, \mathcal{F}) \le \delta_1$ then $T^P$ outputs PASS with probability at least 2/3. (1) 

- If $\beta_2\text{-dist}_{D_2}(P, \mathcal{F}) > \delta_2$ then $T^P$ outputs FAIL with probability at least 2/3. (2) 

Now we extend this definition for relative distance. 

Definition 2 Let $\delta_1, \delta_2 \in [0, 1]$, $D_2 \subseteq D_1$, and 
$\mathcal{F}$ be a collection of real-valued functions defined over $D_1$. Let 
$\theta_1$ and $\theta_2$ be non negative reals. A 
$(D_1, \theta_1, \delta_1; D_2, \theta_2, \delta_2)$-self-tester with relative 
error for $\mathcal{F}$ is a probabilistic oracle program $T$ such that for any 
program $P : D_1 \to \mathbb{R}$: 

- If $\theta_1\text{-rdist}_{D_1}(P, \mathcal{F}) \le \delta_1$ then $T^P$ outputs PASS with probability at least 2/3. 

- If $\theta_2\text{-rdist}_{D_2}(P, \mathcal{F}) > \delta_2$ then $T^P$ outputs FAIL with probability at least 2/3. 

(1) One can also want this probability to be greater than any confidence 
parameter $\gamma \in (0, 1)$. Here we simplify our discussion by fixing this 
parameter to 2/3. 

(2) Same remark. 






Frederic Magniez 



Usually one would like a self-tester to be different from and simpler than any 
correct program. For example we can ask the self-tester to satisfy the little-oh 
property, i.e. its running time has to be asymptotically less than that 
of any known correct program. This property could be too restrictive for family 
testing. Here we simplify this condition. If $T$ is a self-tester for 
$d$-linearity over $\mathcal{D}_n^+$ then $T$ is required to use only 
comparisons, additions, and multiplications by powers of 2 (i.e. left or right 
shifts in binary representation). Moreover the number of queries and operations 
of $T$ has to be independent of $n$. 

A direct consequence of Theorem 2 is the existence of a self-tester for the set 
of linear functions, denoted by $\mathcal{L}$, where the distance is defined for 
a linear error term. 

Theorem 5 Let $0 < \delta < 1/144$ be a real, $\theta > 0$ a power of 2, and 
$\beta(x) = \theta x$, for all $x$. Then there exists a 
$(\beta/16, \delta/12; 32\beta, 24\delta)$-self-tester for the set $\mathcal{L}$. 
Moreover it makes $O(1/\delta)$ queries to the program, and uses $O(1/\delta)$ 
comparisons, additions, and multiplications by powers of 2. 

Proof. Let $N \ge 1$ be an integer whose value will be fixed later. The 
self-tester $T$ performs $N$ independent rounds. Each round consists in 
performing the following experiment, where $\tilde\theta = \theta n$: 

Experiment linearity-test($P$, $\tilde\theta$) 

1. Randomly choose $x$ and $y$. 

2. Check if $|P(2^{k_x}x + y) - 2^{k_x}P(x) - P(y)| \le \tilde\theta$. 

A round fails if the inequality is not satisfied. Then $T$ outputs FAIL if more 
than a $\delta$ fraction of the rounds fail, and PASS otherwise. 

Let us define the failure probability of $P$ in each round by 

    $\mathrm{err}(P) \stackrel{\mathrm{def}}{=} \Pr_{x,y}[|P(2^{k_x}x + y) - 2^{k_x}P(x) - P(y)| > \theta n]$. 

First suppose $(\beta/16)\text{-dist}(P, \mathcal{L}) \le \delta/12$. The halving 
principle and simple manipulations lead to $\mathrm{err}(P) \le \delta/2$. Then a 
standard Chernoff bound argument yields that if $N = \Omega(1/\delta)$ then $T$ 
outputs PASS, with probability at least 2/3. 

Now if $(32\beta)\text{-dist}(P, \mathcal{L}) > 24\delta$, then, since 
$3\delta/2 < 1/96$, the contraposition of Theorem 2 implies 
$\mathrm{err}(P) > 3\delta/2$. Again, by a Chernoff bound argument if 
$N = \Omega(1/\delta)$ then $T$ outputs FAIL, with probability at least 2/3. □ 
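
The round structure of this self-tester can be sketched as follows. This is a hedged illustration, not the paper's construction in full: the sampling ranges for $x$ and $y$ ($\{1, \ldots, n\}$ and $\{1, \ldots, 2n\}$), the constant `c` in the number of rounds, and all function names are assumptions made for the example. Only comparisons, additions and shifts by $k_x$ are applied to the values returned by the program.

```python
import math
import random


def k_x(x: int, n: int) -> int:
    """Smallest k >= 0 such that 2**k * x >= n/2."""
    k = 0
    while (x << k) * 2 < n:
        k += 1
    return k


def linearity_test_round(P, n, theta_tilde, rng) -> bool:
    """One round: True if |P(2^k x + y) - 2^k P(x) - P(y)| <= theta_tilde."""
    x = rng.randint(1, n)        # assumed sampling range for x
    y = rng.randint(1, 2 * n)    # assumed sampling range for y
    k = k_x(x, n)
    defect = P((x << k) + y) - (1 << k) * P(x) - P(y)
    return abs(defect) <= theta_tilde


def self_tester(P, n, theta, delta, c=12, seed=0) -> str:
    """Run about c/delta rounds; FAIL if more than a delta fraction fail."""
    rng = random.Random(seed)
    rounds = math.ceil(c / delta)
    failed = sum(
        not linearity_test_round(P, n, theta * n, rng) for _ in range(rounds)
    )
    return "FAIL" if failed > delta * rounds else "PASS"
```

An exactly linear program such as `lambda t: 3 * t` passes every round, whereas `lambda t: t * t` fails every round for $n = 64$ and $\theta = 1/4$, since its defect is at least $n$.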

The previous self-tester has two main disadvantages. First, the error term is 
linear but not relative. Second, it needs to test the program on a bigger domain. 
The following theorem gets around these two problems. 

Theorem 6 Let $0 < \delta < 1/144$ be a real and $0 < \theta \le 16$ a power of 
2. Then there exists a 
$(\mathcal{D}_n^+, \theta/64, \delta/12; \mathcal{D}_n^+, 32\theta, 24\delta)$-self-tester 
with relative error for the set $\mathcal{L}$. Moreover it makes $O(1/\delta)$ 
queries to the program, and uses $O(1/\delta)$ comparisons, additions, and 
multiplications by powers of 2. 

Proof. Now the self-tester $T$ performs $N = O(1/\delta)$ times the following 
experiment: 

Experiment linearity-relative-test($P$, $\theta$) 

1. Randomly choose $y \in \mathcal{D}_n^+$. 

2. Compute $G = P(n - y) + P(y)$ (fix $P(0) = 0$). 

3. Compute $\tilde\theta = \theta |G|$. 

4. Do Experiment linearity-test(extension($P$, $G$), $\tilde\theta$). 

The function extension is easily computable using $P$, and it is defined by: 

Function extension($P$, $G$)($x$) 

1. val = 0. 

2. While $x > n$ do $x = x - n$ and val = val + $G$. 

3. Return (val + $P(x)$). 

Again, $T$ outputs FAIL if more than a $\delta$ fraction of the rounds fail, and 
PASS otherwise. 
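
The extension step translates almost line for line into code. The sketch below follows the three numbered steps of Function extension; the helper `estimate_G` for step 2 of the experiment, the domain check inside the sample program, and the concrete values $n = 8$ and $P(x) = 3x$ are assumptions made for the example.

```python
def extension(P, G, n):
    """Extend P from {1..n} to all x >= 1 by adding G once per subtracted n."""
    def P_ext(x):
        val = 0
        while x > n:          # step 2: fold x back into {1..n}
            x -= n
            val += G
        return val + P(x)     # step 3
    return P_ext


def estimate_G(P, n, y):
    """G = P(n - y) + P(y), with the convention P(0) = 0."""
    first = 0 if y == n else P(n - y)
    return first + P(y)


N_DOMAIN = 8


def sample_program(x):
    """Hypothetical program, only defined on {1..N_DOMAIN}."""
    assert 1 <= x <= N_DOMAIN, "query outside the domain"
    return 3 * x
```

For the sample program, `estimate_G` returns $24 = 3n$ for every $y$, and the extended program satisfies $\tilde{P}(n + y) = G + \tilde{P}(y)$ without ever querying the original program outside $\{1, \ldots, n\}$.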

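Fix $\tilde\theta \stackrel{\mathrm{def}}{=} \theta |G|$ and 
$\beta(x) \stackrel{\mathrm{def}}{=} \theta |G| x/n$, for all $x$. Let 
$\tilde{P} \stackrel{\mathrm{def}}{=}$ extension($P$, $G$), and denote the 
failure probability of one experiment by $\mathrm{rerr}(P)$. 

First, suppose there exists a linear function $l$ such that 
$(\theta/64)\text{-rdist}_{\mathcal{D}_n^+}(P, l) \le \delta/12$. Therefore 
$\Pr_{y \in \mathcal{D}_n^+}[|P(n-y) + P(y) - l(n)| > \theta |l(n)|/32] \le \delta/6$. 
So $|G - l(n)| \le \theta |l(n)|/32$ with probability at least $1 - \delta/6$. 
Suppose this last inequality is satisfied. Then one can verify that 
$(\theta/32)\text{-rdist}(\tilde{P}, l) \le \delta/12$. Since 
$\theta/32 \le 1/2$, we also obtain $|l(n)| \le 2|G|$. Thus the combination of 
the two last inequalities gives $(\beta/16)\text{-dist}(\tilde{P}, l) \le \delta/12$. 
In this case we previously proved that the failure probability of 
linearity-test(extension($P$, $G$), $\tilde\theta$) is at most $\delta/2$. In 
conclusion, if $(\theta/64)\text{-rdist}_{\mathcal{D}_n^+}(P, \mathcal{L}) \le \delta/12$ 
then $\mathrm{rerr}(P) \le \delta/6 + \delta/2 = 2\delta/3$. 

Suppose now that $(32\theta)\text{-rdist}_{\mathcal{D}_n^+}(P, \mathcal{L}) > 24\delta$. 
Then, for all real $G$, $(32\beta)\text{-dist}_{\mathcal{D}_n^+}(P, l) > 24\delta$, 
where $l \in \mathcal{L}$ is defined by $l(n) \stackrel{\mathrm{def}}{=} G$. But 
$\tilde{P}(n+y) = G + \tilde{P}(y)$, so 
$\mathrm{Med}_{y}(\tilde{P}(n+y) - \tilde{P}(y)) = G$. Therefore the 
contraposition of Theorem 2 implies that, for all real $G$, 
$\mathrm{rerr}(P) > 3\delta/2$. 

To conclude the proof, apply a Chernoff bound argument. □ 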

Now we can state our final result which extends the previous one to 
multi-linear functions. It is quite surprising since it does not use general 
multiplications but only comparisons, additions, and multiplications by powers 
of 2. 

Theorem 7 Let $d \ge 1$ be an integer. Let $0 < \delta < 1$ be a real and 
$0 < \theta \le O(1/d^2)$ a power of 2. Then there exists a 
$(\theta, \delta; O(d)\theta, O(d)\delta)$-self-tester with relative error for 
the set of real-valued $d$-linear functions defined on $(\mathcal{D}_n^+)^d$. 
Moreover it makes $O(1/\delta)$ queries to the program, and uses $O(1/\delta)$ 
comparisons, additions, and multiplications by powers of 2. 

Proof (sketch). We use some techniques from [FHS94] where a similar result for 
multi-variate polynomials in the context of exact computation was proven. Fact 2 
and Lemma 2 lower and upper bound the distance between a $d$-variate function 
and $d$-linear functions by its successive distances from functions which are 
linear in one of their variables. Then, we estimate the latter quantity by 
repeating Experiment linearity-relative-test. More precisely, the self-tester 
$T$ will repeat $O(1/\delta)$ times the following experiment. Then $T$ outputs 
FAIL if more than a $\delta$ fraction of the rounds fail, and PASS otherwise. 

Experiment d-linearity-relative-test($P$, $\theta$) 

1. Randomly choose $z \in (\mathcal{D}_n^+)^d$. 

2. Randomly choose $i \in \{1, \ldots, d\}$. 

3. Do Experiment linearity-relative-test($P_z^i$, $\theta$). 

The notation $P_z^i$ denotes the function which takes at $t$ the value 
$P(z_1, \ldots, z_{i-1}, t, z_{i+1}, \ldots, z_d)$. 

Using Fact 2 and Lemma 2 one can conclude the proof using previous methods. □ 
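
The restriction of a $d$-variate program to a single variable, used in step 3, can be sketched as a small closure. The function name `restrict` and the sample trilinear program are hypothetical; the index `i` is 1-based as in the text.

```python
def restrict(P, z, i):
    """Return t -> P(z_1, ..., z_{i-1}, t, z_{i+1}, ..., z_d), with i 1-based."""
    def P_i(t):
        args = list(z)
        args[i - 1] = t       # replace only the i-th coordinate
        return P(*args)
    return P_i


def trilinear(a, b, c):
    """Hypothetical 3-linear program used for illustration."""
    return a * b * c
```

For instance, with $z = (2, 3, 5)$ and $i = 2$ the restricted function is $t \mapsto 10t$, which is linear in $t$, so a univariate linearity test applies to it unchanged.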

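The bounds involved in the previous proof are explicitly stated in the 
following, where $\mathcal{L}^d$ denotes the set of $d$-linear functions defined 
over $(\mathcal{D}_n^+)^d$ and $\mathcal{L}_i^d$ the set of functions defined 
over $(\mathcal{D}_n^+)^d$ which are linear in their $i$-th variable. 

First let us state the easy one. 

Fact 2 Let $\theta > 0$ be a real. Then for all $f : (\mathcal{D}_n^+)^d \to \mathbb{R}$ 

    $\frac{1}{d} \sum_{i=1}^{d} \theta\text{-rdist}_{(\mathcal{D}_n^+)^d}(f, \mathcal{L}_i^d) \le \theta\text{-rdist}_{(\mathcal{D}_n^+)^d}(f, \mathcal{L}^d)$. 

The other bound is more difficult and it can be proven by induction on $d$. 
Due to lack of space we omit the proof. 

Lemma 2 Let $0 < \theta \le 1/(16d^2)$ be a real. Then for all $f : (\mathcal{D}_n^+)^d \to \mathbb{R}$ 

    $(4d\theta)\text{-rdist}_{(\mathcal{D}_n^+)^d}(f, \mathcal{L}^d) \le 2 \sum_{i=1}^{d} \theta\text{-rdist}_{(\mathcal{D}_n^+)^d}(f, \mathcal{L}_i^d)$. 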

Open Questions 

In this paper we achieve the goal of approximate self-testing with relative error 
for multi-linear functions. We would like to extend this work to polynomials. 
More generally, when we have no a priori information on the size of the function 
to be computed, constructing approximate self-testers with relative error is an 
interesting challenge. 

Abstract. In this note we first formalize the notion of hard tautologies 
using a nondeterministic generalization of instance complexity. We then 
show, under reasonable complexity-theoretic assumptions, that there are 
infinitely many propositional tautologies that are hard to prove in any 
sound propositional proof system. 



1 Introduction 

In their seminal paper Cook and Reckhow first formalized the study of lengths 
of proofs in propositional proof systems. They showed that NP = coNP if and 
only if there is a sound and complete propositional proof system $S$ and 
polynomial $p$ such that all tautologies $F$ (i.e. all $F \in$ TAUT) have proofs 
in $S$ of size $p(|F|)$. The Cook-Reckhow paper led to an extensive and fruitful 
study of lengths of proofs for standard proof systems of increasing complexity 
and strength. A recent survey of the area by Beame and Pitassi is contained in 
[2]. A restatement of the Cook-Reckhow result is that if NP $\ne$ coNP then for 
every sound propositional proof system $P$ and polynomial bound $p$ there is an 
infinite collection of “hard” tautologies: i.e. the shortest proof in $P$ of 
each such tautology $F$ is of length more than $p(|F|)$. A stronger notion of 
hard tautologies for a proof system is defined by Krajíček [3, Definition 
14.2.1]. He defines a sequence of tautologies $\{\varphi_n\}$ to be hard for a 
proof system $P$ if for every $n$ the formula $\varphi_n$ can be computed from 
$1^n$ in polynomial time, has size at least $n$, and there is no polynomial $p$ 
so that $p(|\varphi_n|)$ bounds the size of the proof of $\varphi_n$ in $P$ for 
all $n$. Under a complexity assumption (slightly stronger than NP $\ne$ coNP) it 
is shown by Krajíček [3, Theorem 14.2.3] that for any non-optimal proof system 
$P$ there is a sequence of hard tautologies for $P$. A more difficult question 
is whether there is a family of tautologies that is hard in the above sense for 
every sound proof system. Krajíček's construction does not yield this, as the 
degree of the polynomial bounding the construction time of hard tautologies 
depends on the proof system and can become arbitrarily large. In a recent result 
Riis and Sitharam show under the assumption NEXP $\ne$ coNEXP how to construct a 
sequence of tautologies $\{\phi_n\}_{n>0}$ that do not have polynomial-size 
proofs in any propositional proof system. The property of $\{\phi_n\}_{n>0}$ 
here is that the sequence is, in a formal logic sense, uniformly generated. 
Notice that this result is really about the hardness of the entire sequence and 
not about the individual tautologies $\phi_n$. Indeed, the sequence 
$\{\phi_n\}_{n>0}$ might have an infinite subsequence that has polynomial-size 
proofs in some proof system. 

Schoning dni approaches the question of hard tautologies using the notion 
of complexity cores first defined in He shows in ^21 that if NP ^ coNP 
then there exist a constant $\varepsilon$ and a collection $\mathcal{F}$ of 
tautologies of density at least $2^{n^{\varepsilon}}$ i.o. such that for every 
sound proof system for TAUT and for every polynomial $p$, the shortest proof of 
$F$ has length more than $p(|F|)$ for all but finitely many $F \in \mathcal{F}$. 
(The density part of this result is a consequence of the fact that TAUT has a 
linear-time padding function.) Again, notice here that $\mathcal{F}$ could have 
individual tautologies that are easy for specific proof systems. 

A natural question concerning hard tautologies that arises, which we address 
in this paper, is whether there exist individual tautologies which are “hard” for 
each sound proof system. Notice that none of the results mentioned above assert 
the existence of an infinite collection of tautologies in which every individual 
tautology is hard for every sound proof system. First, what do we mean by 
a tautology that is hard for all sound proof systems? We have to handle the 
following difficulty in the definition. For any $F \in$ TAUT, we can always define a
sound proof system S that includes F as an axiom. Clearly, F will have a trivial 
proof in the proof system S. The key point here is that F has to be somehow 
encoded into the proof system S, implying that the size of the description of S 
needs to be taken into account. We present a formal definition of hard tautology 
for a proof system based on the notion of instance complexity [Zj. Intuitively, 
given a polynomial bound p for proof length, a tautology F is hard if for any 
given proof system P, either the shortest proof of F in P has size greater than
p(|F|) or else the size of P (considered as a string) is big enough to contain an 
encoding of F. We formalize the definition of hard tautology using the concepts 
of nondeterministic instance complexity and Kolmogorov complexity. We then 
prove, under reasonable complexity-theoretic assumptions, the existence of an 
infinite set of tautologies with the property that every single tautology in the set 
is hard (with respect to the discussed hardness notions) for every sound proof 
system. 

2 Definitions 

We fix the alphabet as $\Sigma = \{0,1\}$. Let SAT denote the language of satisfiable
propositional formulas suitably encoded as strings in $\Sigma^*$. Similarly, let TAUT
denote the language of propositional tautologies. For the definitions of standard
complexity classes like P, NP, coNP, NEXP, the polynomial hierarchy PH etc.
we refer the reader to a standard textbook, e.g. m The class $\Theta_2^p$ (cf. jHl), also
described as $\mathrm{P}^{\mathrm{NP}}[O(\log n)]$, consists of all sets decidable by a polynomial-time
oracle machine that can make $O(\log n)$ queries to an NP oracle.

We start with a definition of proof systems that is equivalent to the original 
Cook-Reckhow formulation.

Definition 1. A (sound and complete) propositional proof system S is defined
to be a polynomial-time predicate S such that for all F,

$$F \in \mathrm{TAUT} \Leftrightarrow \exists p : S(F,p).$$

In other words, a proof system can be identified with an efficient procedure for 
checking correctness of proofs. Thus, in complexity-theoretic terms we can also 
identify a propositional proof system S with a nondeterministic Turing machine 
M. The soundness and completeness properties of S can be expressed in terms 
of M as follows: 

$F \in L(M) \Rightarrow F \in \mathrm{TAUT}$ (soundness)

$F \in \mathrm{TAUT} \Rightarrow F \in L(M)$ (completeness)

For the rest of the paper we use the characterization of propositional proof 
systems as nondeterministic Turing machines. 
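
To make the "efficient procedure for checking correctness of proofs" concrete, here is a minimal Python sketch of one such predicate S(F, p). It uses resolution refutations, which is our choice for illustration and is not the system discussed in the paper: a formula F is a tautology exactly when the CNF of its negation is unsatisfiable, so the checker takes that CNF and a list of resolution steps. The function name `check_resolution_proof` and the integer encoding of literals are assumptions made for this example.

```python
from typing import FrozenSet, List, Sequence, Tuple

Clause = FrozenSet[int]      # literal k stands for variable k, -k for its negation
Step = Tuple[int, int, int]  # (index of first clause, index of second clause, pivot variable)


def check_resolution_proof(cnf: Sequence[Clause], proof: List[Step]) -> bool:
    """Return True iff `proof` derives the empty clause from `cnf` by resolution.

    The check runs in time polynomial in the sizes of `cnf` and `proof`, so it
    plays the role of the predicate S(F, p), where `cnf` encodes "not F".
    """
    derived: List[Clause] = list(cnf)
    for i, j, pivot in proof:
        # Every step must refer to clauses that are already available.
        if pivot <= 0 or not (0 <= i < len(derived) and 0 <= j < len(derived)):
            return False
        left, right = derived[i], derived[j]
        # The pivot must occur positively in one clause and negatively in the other.
        if pivot not in left or -pivot not in right:
            return False
        derived.append((left - {pivot}) | (right - {-pivot}))
    # Accept only if the empty clause (a contradiction) is present.
    return frozenset() in derived
```

Because resolution is sound, the checker accepts only when the CNF is unsatisfiable, that is, only when F is a tautology; the length of the list `proof` is the proof size whose growth the paper studies.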

Nondeterministic Instance Complexity The notion of instance complexity 
was introduced by Orponen et al. in 0 as a measure of the complexity of individ- 
ual instances. The motivation in that paper is to study whether the hardness of 
NP-complete problems, particularly SAT, can be pinpointed to individual hard 
instances in some formal sense. 

We adapt the instance complexity idea to a nondeterministic setting that is 
suitable for defining hard individual tautologies. For a set $A \subseteq \Sigma^*$ we say that
a nondeterministic machine M is A-consistent if $L(M) \subseteq A$.

Definition 2. For a set A and a time bound t, the t-time-bounded nondeterministic
instance complexity (nic for short) of x w.r.t. A is defined as:

$$\mathrm{nic}^t(x : A) = \min\{\, |M| : M \text{ is an } A\text{-consistent } t\text{-time-bounded nondeterministic machine, and } M \text{ decides correctly on } x \,\}.$$

As required for instance complexity 0 and Kolmogorov complexity 0, we
formalize nic in terms of a fixed universal machine U that executes nondeterministic
programs. For a time bound function t, and two strings $q, x \in \Sigma^*$, we
denote by $U^t(q,x)$ the result of running program q on input x in the universal
machine U for $t(|x|)$ steps. Notice that the nic measure is (up to an additive
constant) essentially the same w.r.t. efficient universal machines.

We now turn to the formal definition of a set A having hard instances
w.r.t. the nic measure. Intuitively, we consider x to be a hard instance if there
is no easier way for an A-consistent nondeterministic program to decide x than
to explicitly encode x into the program. To define this formally, we consider
a nondeterministic version of Kolmogorov complexity first defined in . More
precisely, given a time bound t and a string $x \in \Sigma^*$,

$$\mathrm{CND}^t(x) = \min\{\, |M| : M \text{ is a } t\text{-time-bounded nondeterministic Turing machine with } L(M) = \{x\} \,\}$$

is the nondeterministic t-time-bounded decision Kolmogorov complexity of x
w.r.t. M. In the standard way (see [H]) we can consider the CND measure to
be defined w.r.t. a fixed universal machine. Notice that the CND measure is a
nondeterministic generalization of Sipser's CD measure M- We note in passing
that there is no difference between the nondeterministic Kolmogorov complexity
of checking and generating (unlike in the deterministic case where the C
and CD time-bounded measures appear to be different m). The CND measure
gives an immediate upper bound to nondeterministic instance complexity.

The next proposition will provide an upper bound on the size of a sound 
proof system that for a given tautology has proofs of a certain length. 

Proposition 1. For any language A, and time bound t, there is a constant c > 0
such that $\mathrm{nic}^t(x : A) \le \mathrm{CND}^t(x) + c$, for all $x \in \Sigma^*$.

We now define hard instances for a language A w.r.t. the nic measure. 

Definition 3. A language A is said to have hard instances w.r.t. the nic measure
if for every polynomial t there are a polynomial t' and a constant c such that for
infinitely many x we have $\mathrm{nic}^t(x : A) > \mathrm{CND}^{t'}(x) - c$.

We observe next that hard instances of A, if any, must necessarily be in A.
This is in contrast to the deterministic instance complexity definition 0 for
which hard instances could be in A or $\overline{A}$.

Proposition 2. For any set A there is a constant c such that for all time bounds
t and for all $x \notin A$, $\mathrm{nic}^t(x : A) \le c$.

Proof. It follows from the fact that any nondeterministic machine that accepts
$\emptyset$ is consistent with A and decides correctly on all instances $x \notin A$. □
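
The definition of nic and the argument behind Proposition 2 can be explored in a small finite toy model. The Python sketch below is an illustration only, under strong simplifying assumptions that are ours: a "machine" is just a pair (description size, accepted language), the time bound is ignored, and the function name `nic` and the sample machines are hypothetical.

```python
from typing import FrozenSet, Iterable, Optional, Tuple

Machine = Tuple[int, FrozenSet[str]]  # (description size |M|, accepted language L(M))


def nic(x: str, A: FrozenSet[str], machines: Iterable[Machine]) -> Optional[int]:
    """Smallest |M| among A-consistent machines that decide x correctly.

    Returns None when no machine in the (finite, toy) collection qualifies.
    Time bounds are deliberately ignored in this sketch.
    """
    sizes = [
        size
        for size, lang in machines
        if lang <= A                   # A-consistency: L(M) is a subset of A
        and (x in lang) == (x in A)    # M decides x correctly
    ]
    return min(sizes, default=None)


A = frozenset({"00", "11"})
machines = [
    (1, frozenset()),                  # accepts the empty language
    (5, frozenset({"00"})),            # hard-codes the single instance "00"
    (3, frozenset({"00", "01"})),      # not A-consistent: accepts "01"
    (7, frozenset({"00", "11"})),      # decides all of A
]
```

In this toy setting every x outside A gets the constant value 1 from the machine that accepts nothing, which mirrors Proposition 2, while the small but inconsistent machine of size 3 is never allowed to lower the complexity of "00".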

Observe that by Proposition 1, for every tautology F and time bound t,
there is always a sound proof system P of size bounded by $\mathrm{CND}^t(F) + c$ for
some constant c, and in which F has proofs of size at most $t(|F|)$. Ideally we
would like to obtain a matching lower bound for the nic measure, as stated in
the following open question.

Problem 1. If NP $\neq$ coNP, is it true that for every polynomial time bound t there are
infinitely many tautologies F and a constant c such that
$\mathrm{nic}^t(F : \mathrm{TAUT}) > \mathrm{CND}^t(F) - c$ holds?

Because of the above observation, this result would be best possible. Although
we cannot match the upper bound, we prove somewhat weaker lower bounds using
less restricted notions of Kolmogorov complexity instead of the CND measure.
In particular, we consider the standard unbounded Kolmogorov complexity measure
C P) and a relativized time-bounded Kolmogorov complexity measure $C^{B,t}$:
$C^{B,t}(x)$ is defined to be the t-time-bounded Kolmogorov complexity of string x,
where the universal machine has oracle access to oracle B. We summarize below
the main results of this paper, proved in Section 3.

- If PH does not collapse to $\Theta_2^p$ then for every polynomial t and constant c > 0
  there are infinitely many tautologies F such that $\mathrm{nic}^t(F : \mathrm{TAUT}) > c \log |F|$.

- NP $\neq$ coNP if and only if for every polynomial t there exist a constant c > 0
  and infinitely many tautologies F such that $\mathrm{nic}^t(F : \mathrm{TAUT}) > C(F) - c$.

- NP $\neq$ coNP if and only if for every polynomial t there exist a constant
  c > 0, a polynomial t', and infinitely many tautologies F such that
  $\mathrm{nic}^t(F : \mathrm{TAUT}) > C^{\mathrm{SAT}_2,t'}(F) - c$, where $\mathrm{SAT}_2$ denotes all true quantified boolean
  formulas with $\exists\forall$ as quantifier prefix.

Each of the above results can be interpreted as stating that, under a reason- 
able complexity-theoretic assumption, there exist infinitely many tautologies F 
such that any sound proof system S in which F has short proofs must have large 
description (the description size is some function of |F|). 

Notice that the latter two results are weaker than the ideal possible result
mentioned above because for all F and all polynomials t we have
$C(F) \le C^{\mathrm{SAT}_2,t}(F) \le \mathrm{CND}^t(F)$. Also, observe that the first result, proving an $O(\log |F|)$
lower bound on the nondeterministic instance complexity of infinitely many
tautologies F, is incomparable to the other results (and also to the statement in
Problem 1) because it does not refer to the Kolmogorov complexity of F.



Hard Instances and NP Cores We review NP cores in TAUT defined by
Schöning m and briefly compare them with nondeterministic instance complexity.
An NP core in TAUT is an infinite subset $C \subseteq \mathrm{TAUT}$ such that for all
polynomials p and for all proof systems S, at most finitely many $F \in C$ have proofs
in system S of length bounded by $p(|F|)$. In ^^1 it is shown that NP $\neq$ coNP if
and only if TAUT has an NP core (in fact, of density $2^{n^\epsilon}$, for a constant fraction
$\epsilon > 0$).

Theorem 1. Assuming NP $\neq$ coNP, there exist a constant $\epsilon$ and a collection
$\mathcal{F}$ of tautologies of density at least $2^{n^\epsilon}$ i.o. such that for every sound proof
system for TAUT and for every polynomial p, the shortest proof of F has length
more than $p(|F|)$ for all but finitely many $F \in \mathcal{F}$.

The above theorem does not really talk about tautologies that are hard for 
each proof system. We can only make the following easy connection to the non- 
deterministic instance complexity of tautologies in an NP core of TAUT. 

Proposition 3. A set C is an NP core of a recursive set A (and hence also
TAUT) if and only if for every polynomial t and constant c, $\mathrm{nic}^t(x : A) > c$ for
all but finitely many $x \in C$.



3 The Results 

We prove two theorems which assert, in different ways, that there exist infinitely
many hard tautologies assuming that the polynomial hierarchy does not collapse.

Theorem 2. If PH $\neq \Theta_2^p$ then given any polynomial t and constant c, there
exist infinitely many tautologies F such that $\mathrm{nic}^t(F : \mathrm{TAUT}) > c \log |F|$.

Proof. Assume the contrary: Suppose there are a polynomial t and constant c
such that for all $F \in \mathrm{TAUT}$: $\mathrm{nic}^t(F : \mathrm{TAUT}) \le c \log |F|$. In particular, for any
n we have

$$\forall F \in \mathrm{TAUT}^{\le n} : \mathrm{nic}^t(F : \mathrm{TAUT}) \le c \log n.$$

It suffices to show under this assumption that $\Sigma_2^p \subseteq \Theta_2^p$. Let M be an oracle
NP machine with TAUT as oracle. With no loss of generality, we can assume
that on each computation path M makes precisely one positive oracle query to
TAUT, and does so at the end of the path before it decides on acceptance. Let
p denote a polynomial such that the running time of M on input x is bounded
by $p(|x|)$, for all x. We will design a $\Theta_2^p$ machine N that accepts $L(M)$. Let x
be a length m input to M. Let n denote $p(m)$, which bounds the size of TAUT
queries that $M(x)$ can make. Given the $O(\log n)$ bound on the nondeterministic
instance complexity of tautologies in $\mathrm{TAUT}^{\le n}$, we only have to use the
TAUT-consistent nondeterministic programs in the set $\Sigma^{\le c \log n}$, whose size is bounded
by some polynomial in n. But first we must get rid of the TAUT-inconsistent
nondeterministic programs in $\Sigma^{\le c \log n}$. We show that the inconsistent programs
can be bundled out in an NP set, and by a census argument that can be carried
out with a $\Theta_2^p$ computation we can exactly count the TAUT-inconsistent
nondeterministic programs in $\Sigma^{\le c \log n}$. After that a single NP computation can guess
the TAUT-consistent subset of $\Sigma^{\le c \log n}$ and use them to replace the TAUT
oracle.

$$\mathrm{BAD} := \{ (p, 0^n) \mid p \in \Sigma^{\le c \log n} : \exists F \in \Sigma^{\le n} - \mathrm{TAUT}^{\le n} : U^t(p, F) \text{ accepts} \}$$

Notice that BAD is in NP. In fact, BAD is a sparse NP set:
$\|\{p \in \Sigma^{\le c \log n} \mid (p, 0^n) \in \mathrm{BAD}\}\| = O(n^c)$. Let $\mathrm{BAD}_n$ denote the set
$\{p \in \Sigma^{\le c \log n} \mid (p, 0^n) \in \mathrm{BAD}\}$. We can use a standard census argument like the one in jOj to determine
$\|\mathrm{BAD}_n\|$ with a $\Theta_2^p$ computation. The $\Theta_2^p$ machine N will first compute $\|\mathrm{BAD}_n\|$.
Now it is easy to simulate $M^{\mathrm{TAUT}}(x)$ with an NP computation: First guess the
$\|\mathrm{BAD}_n\|$ strings in $\mathrm{BAD}_n$ and verify with an NP computation that $(p, 0^n) \in \mathrm{BAD}$
for each guessed p. Having guessed and verified $\mathrm{BAD}_n$, the remaining strings in
$\Sigma^{\le c \log n}$ are TAUT-consistent programs. The NP computation now proceeds
to simulate $M^{\mathrm{TAUT}}(x)$. When the simulation encounters a positive TAUT query
F, it accepts if and only if

$$\exists p \in (\Sigma^{\le c \log n} - \mathrm{BAD}_n) : U^t(p, F) \text{ accepts}.$$

To summarize, the $\Theta_2^p$ machine N on input x first computes $\|\mathrm{BAD}_n\|$ with a $\Theta_2^p$
computation. What remains to simulate is just an NP computation.

Hence PH collapses to $\Theta_2^p$. □
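
The census step of the proof can be mimicked on a finite toy instance. The following Python sketch is only an illustration under assumptions of our own: programs are modelled as explicit sets of accepted formulas, the set `taut` serves as ground truth for the witness check, the nondeterministic guess is replaced by deterministic enumeration, and all function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Optional, Set

Programs = Dict[str, FrozenSet[str]]  # program name -> set of formulas it accepts


def is_bad(p: str, programs: Programs, taut: Set[str]) -> bool:
    """NP-style check: program p accepts some non-tautology (the witness F)."""
    return any(f not in taut for f in programs[p])


def census(programs: Programs, taut: Set[str]) -> int:
    """Stand-in for the census computation of the number of bad programs."""
    return sum(is_bad(p, programs, taut) for p in programs)


def answer_query(f: str, programs: Programs, taut: Set[str], k: int) -> Optional[bool]:
    """Answer the query 'is f a tautology?' knowing only the census k."""
    for guess in combinations(sorted(programs), k):        # enumerate the guesses
        if all(is_bad(p, programs, taut) for p in guess):  # verify each guessed p
            good = set(programs) - set(guess)              # the consistent programs
            return any(f in programs[q] for q in good)
    return None  # no guess of size k verifies, so k exceeds the true census


taut = {"T1", "T2"}
programs = {
    "a": frozenset({"T1"}),
    "b": frozenset({"T2", "N1"}),  # inconsistent: accepts the non-tautology N1
    "c": frozenset({"T2"}),
    "d": frozenset({"N1"}),        # inconsistent
}
```

With the exact census (here 2) any verified guess must be the whole bad set, so the remaining programs are all consistent. With an underestimate such as k = 1, the guess {b} verifies but leaves d in play, and the non-tautology N1 would be wrongly accepted; this is why the exact count is needed.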

We now turn to the next result which is closer in spirit to the definition of
hard-to-prove tautologies.

Theorem 3. NP $\neq$ coNP if and only if for every polynomial t there exist a
constant c > 0 and infinitely many tautologies F such that
$\mathrm{nic}^t(F : \mathrm{TAUT}) > C(F) - c$.

Proof. We first prove the forward implication. Our proof is an easy modification
of the Fortnow-Kummer proof technique in 0 for the instance complexity
conjecture 0 for NP-hard sets under honest reductions (this technique is also used by
Mundhenk CD to prove a weaker form of the general instance complexity
conjecture of j2j). The idea is to build a deterministic Turing machine M that on input
$0^n$ searches for and outputs a tautology F such that $\mathrm{nic}^t(F : \mathrm{TAUT}) > cn$ for a
constant c. The fact that M outputs F on input $0^n$ implies that $C(F) \le \log n$,
proving the theorem. The proof of correctness argues that if M never halts on
some input $0^n$ then, in fact, TAUT $\in$ NP, contradicting the assumption. We give
the formal details.

Fix $n \in \mathbb{N}$. Let $I_n := \{q \in \Sigma^* \mid |q| \le 2n\}$ be the candidate nondeterministic
programs. We describe the machine M:

1. input $0^n$; m := 0;

2. $I_n := \{q \in \Sigma^* \mid |q| \le 2n\}$; m := 0;

3. Stage m:

(a) Spend m steps in computing an initial segment $\sigma$ of TAUT using
brute-force search;

(b) Spend m steps in simulating $U^t(q,x)$ for $q \in I_n$, where $x \in \sigma$ is
picked in lexicographic order;

(c) If q is found incompatible with $\sigma$ then eliminate q from $I_n$;

(d) Do a prefix search for $F \in \Sigma^{\le m}$ with the following property:
$F \in \mathrm{TAUT}^{\le m}$ and all programs in $I_n$ reject F;

(e) If the prefix search finds such an F then output F and stop. Else goto
Stage m+1.
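
The staged search above can be sketched in Python on a finite toy universe. This is a hedged illustration, not the construction itself: the step budget "m steps" is replaced by "look at the first m formulas", programs are explicit sets of accepted formulas, `is_tautology` stands in for the brute-force search, and the name `find_hard_tautology` is an assumption.

```python
from typing import Callable, Dict, FrozenSet, List, Optional


def find_hard_tautology(
    universe: List[str],
    is_tautology: Callable[[str], bool],
    programs: Dict[str, FrozenSet[str]],
) -> Optional[str]:
    """Toy version of machine M: return a tautology rejected by every surviving program."""
    candidates = dict(programs)                    # the candidate set I_n
    for m in range(1, len(universe) + 1):          # Stage m
        segment = universe[:m]                     # (a) initial segment examined so far
        for name in list(candidates):              # (b), (c) drop inconsistent programs
            accepted = candidates[name]
            if any(f in accepted and not is_tautology(f) for f in segment):
                del candidates[name]
        for f in segment:                          # (d) search in a fixed order
            if is_tautology(f) and all(f not in acc for acc in candidates.values()):
                return f                           # (e) output F and stop
    return None                                    # toy universe exhausted


universe = ["a", "b", "c", "d"]
tautologies = {"a", "c", "d"}
programs = {
    "p": frozenset({"a"}),
    "q": frozenset({"b", "c"}),  # inconsistent: accepts the non-tautology "b"
    "r": frozenset({"c"}),
}
```

In this example the search returns "d": program q is eliminated at stage 2, while "a" and "c" remain accepted by the consistent programs p and r. When the surviving programs accept every tautology, the toy search returns None, which corresponds to the non-terminating case analysed in the claim that follows.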

Claim. TAUT $\in$ NP if $M(0^n)$ does not terminate for some n.

To see this, suppose $M(0^n)$ does not terminate. Steps (a), (b), and (c) ensure
that after some stage $m = m_0$, all TAUT-inconsistent programs in $I_n$ are
eliminated. Let I denote the remaining TAUT-consistent programs. Now, if the
prefix search in Step (d) fails for each $m > m_0$ it follows that for all $m > m_0$

$$F \in \mathrm{TAUT}^{\le m} \Leftrightarrow U^t(q,F) \text{ accepts for some } q \in I$$

which implies that TAUT $\in$ NP.

By assumption, therefore, $M(0^n)$ halts for each n. Let $F_n$ be the formula
output by the computation of $M(0^n)$. The prefix search ensures that
$\mathrm{nic}^t(F_n : \mathrm{TAUT}) > 2n$. The fact that $M(0^n)$ has output $F_n$ implies that $C(F_n) \le \log n$.
Thus $\mathrm{nic}^t(F_n : \mathrm{TAUT}) > C(F_n) - c$, for a suitable constant c, holds for all n,
which completes the proof.

To see the reverse implication, notice that if NP = coNP then for some
polynomial time bound t and some constant c we have $\mathrm{nic}^t(F : \mathrm{TAUT}) \le c$.
Now, since $C(F) > c$ for almost all F, it follows that $\mathrm{nic}^t(F : \mathrm{TAUT}) < C(F)$
for almost all F. □

Notice in the above proof that instead of C(F) we can use polynomial-time
bounded Kolmogorov complexity with a $\Sigma_2^p$ oracle. This is due to the fact that
with a $\Sigma_2^p$ oracle the machine M can do the prefix search for F in polynomial
time. Thus, we also have the following (stronger) result. Let $\mathrm{SAT}_2$ denote the
$\Sigma_2^p$-complete language consisting of all $\exists\forall$ quantified boolean formulas that are
true.

Theorem 4. NP $\neq$ coNP if and only if for every polynomial t there exist a
constant c > 0, a polynomial t', and infinitely many tautologies F such that
$\mathrm{nic}^t(F : \mathrm{TAUT}) > C^{\mathrm{SAT}_2,t'}(F) - c$.

4 Discussion 

The interesting open question is to improve Theorems 3 and 4 by proving that
if NP $\neq$ coNP then for every polynomial t there are infinitely many tautologies
F, a polynomial t' and a constant c such that $\mathrm{nic}^t(F : \mathrm{TAUT}) > \mathrm{CND}^{t'}(F) - c$
holds. The difficulty in adapting the Fortnow-Kummer proof technique to prove
this is in carrying out the prefix search in Step 3(d) of the algorithm in the proof
of Theorem 3. A sufficiently powerful oracle is required to carry out the prefix
search for $F \in \mathrm{TAUT}^{\le m}$ in polynomial time. More accurately, the prefix search
requires a leading existential quantifier followed by a universal quantifier (both
for ensuring that $F \in \mathrm{TAUT}^{\le m}$ and checking that all nondeterministic programs
in $I_n$ reject F). Thus, we require a $\Sigma_2^p$ oracle for the task; it is not clear if an
NP oracle suffices.

This brings us to a related issue concerning the deterministic instance
complexity result of jS]: In that paper a similar prefix search as mentioned above
is carried out. Here the prefix search is for a hard instance $F \in \Sigma^n$ (not
necessarily in SAT because, unlike for the nic measure, hard ic instances can be
in SAT or $\overline{\mathrm{SAT}}$). At first sight it appears that an NP oracle is required for the
prefix search, but Fortnow and Kummer |S| apply the clever trick of using the
programs in the set I to simulate this NP oracle and interpreting the program
answers in some suitable way. However, the result of the prefix search could be a
formula in SAT or $\overline{\mathrm{SAT}}$, and there is no way of guaranteeing it to be in SAT, for
instance. Thus, the proof in Ej cannot guarantee that there are infinitely many
hard instances in SAT (although it certainly follows from their proof that either
SAT or $\overline{\mathrm{SAT}}$ has infinitely many hard instances). Thus, the following question
concerning deterministic instance complexity of SAT remains open.

Problem 2. If P $\neq$ NP, is it true that for each polynomial t there are a polynomial t', a
constant c, and infinitely many $F \in \mathrm{SAT}$ such that $\mathrm{ic}^t(F : \mathrm{SAT}) > C^{t'}(F) - c$?

However, guided by an NP oracle, a prefix search for a hard instance
$F \in \mathrm{SAT}^{\le n}$ can be carried out in polynomial time (the additional Fortnow-Kummer
trick 0 of replacing the NP oracle with the set of remaining programs does not
appear possible if we need to search for $F \in \mathrm{SAT}$). This gives us the following
result that can be proved on the same lines as Theorems 3 and 4.

Theorem 5. P $\neq$ NP if and only if for each polynomial t there are a polynomial
t', a constant c, and infinitely many $F \in \mathrm{SAT}$ such that
$\mathrm{ic}^t(F : \mathrm{SAT}) > C^{\mathrm{SAT},t'}(F) - c$.

The above result is very similar to m Theorem 5] which proves a weaker 
form of the instance complexity conjecture of |3- 

In order to put the above result in better perspective we define a one-sided
version of deterministic instance complexity. For $A \subseteq \Sigma^*$ a deterministic machine
M is one-sided A-consistent if $L(M) \subseteq A$.

Definition 4. For $A \subseteq \Sigma^*$ and a time bound t, the t-time-bounded one-sided
deterministic instance complexity ($\mathrm{ic}_+$ for short) of x w.r.t. A is defined as:

$$\mathrm{ic}_+^t(x : A) = \min\{\, |M| : M \text{ is a one-sided } A\text{-consistent } t\text{-time-bounded deterministic Turing machine and } M \text{ decides correctly on } x \,\}.$$

As usual, $\mathrm{ic}_+^t$ is defined w.r.t. a fixed universal machine. The one-sidedness
of the definition implies, as in the case of nic, that hard instances w.r.t. the $\mathrm{ic}_+$
measure can only be in A. This follows from the fact that any deterministic machine
that accepts $\emptyset$ is consistent with A and decides correctly on all instances $x \notin A$.

Proposition 4. For any set A there is a constant c such that for all time bounds
t and for all $x \notin A$, $\mathrm{ic}_+^t(x : A) \le c$.

We say A has hard instances w.r.t. the $\mathrm{ic}_+$ measure if for each polynomial t
there are a constant c and polynomial t' such that for infinitely many x,
$\mathrm{ic}_+^t(x : A) > C^{t'}(x) - c$. The following result can also be proved exactly as Theorem 5.

Theorem 6. P $\neq$ NP if and only if for each polynomial t there are a polynomial
t', a constant c, and infinitely many $F \in \mathrm{SAT}$ such that
$\mathrm{ic}_+^t(F : \mathrm{SAT}) > C^{\mathrm{SAT},t'}(F) - c$.

Abstract. This paper investigates the instance complexities of problems
that are hard or weakly hard for exponential time under polynomial
time, many-one reductions. It is shown that almost every instance of almost
every problem in exponential time has essentially maximal instance
complexity. It follows that every weakly hard problem has a dense set of
such maximally hard instances. This extends the theorem, due to Orponen,
Ko, Schöning and Watanabe (1994), that every hard problem for
exponential time has a dense set of maximally hard instances. Complementing
this, it is shown that every hard problem for exponential time
also has a dense set of unusually easy instances.



1 Introduction 

A problem that is computationally intractable in the worst case may or may 
not be intractable in the average case. In applications such as cryptography 
and derandomization, where intractability is a valuable resource, worst-case in- 
tractability seldom suffices, average-case intractability often suffices, and almost- 
everywhere intractability is sometimes required. Implicit in these distinctions is 
the truism that some instances of a computational problem may be hard while 
others are easy. 

The complexity of an individual instance of a problem cannot be measured 
simply in terms of the running time required to solve that instance, because 
any algorithm for the problem can be modified to solve that instance quickly 
via a look-up table. Orponen, Ko, Schoning, and Watanabe m used ideas from 
algorithmic information theory to circumvent this difficulty, thereby introducing 
a precise formulation of the complexities of individual instances of computational 
problems. 

Given a decision problem $A \subseteq \{0,1\}^*$, an instance $x \in \{0,1\}^*$, and a time
bound $t : \mathbb{N} \to \mathbb{N}$, Orponen, Ko, Schöning, and Watanabe m defined the
t-time-bounded instance complexity of x relative to A, written $\mathrm{ic}^t(x : A)$, to be
the number of bits in the shortest program $\pi$ such that $\pi$ decides x in at most
$t(|x|)$ steps and $\pi$ does not decide any string incorrectly for A. (See sections
2 and 3 below for complete definitions of this and other terms used in this
introduction.) Instance complexity has now been investigated and applied in a
number of papers, including EH urn H 13 d El EH], and is discussed at some
length in the text m.

* This research was supported in part by National Science Foundation Grant
CCR-9610461.

In this paper we investigate the instance complexities of problems that are 
hard or weakly hard for exponential time under polynomial time, many-one 
reductions. Our most technical results establish the measure-theoretic abundance 
of problems for which almost all instances have essentially maximal instance 
complexities. From these results we derive our main results, which are lower 
bounds on the instance complexities of weakly hard problems, and we separately 
establish upper bounds on the instance complexities of hard problems. We now 
discuss these results in a little more detail. 

The t-time-bounded plain Kolmogorov complexity of a string x, written $C^t(x)$,
is the number of bits in the shortest program $\pi$ that describes (i.e., prints) x in at
most $t(|x|)$ steps. As observed in jSD], it is easy to see that, for t' modestly larger
than t, $\mathrm{ic}^{t'}(x : A)$ cannot be much larger than $C^t(x)$, since a description of x
contains all but one bit of the information required for a program to correctly decide
whether $x \in A$ and decline to decide all other strings. An instance x thus has
essentially maximal t-time-bounded instance complexity if $\mathrm{ic}^t(x : A)$ is nearly as
large as $C^{t'}(x)$, where t' is modestly larger than t. Orponen, Ko, Schöning, and
Watanabe established the existence of a problem A G E = DTIME(2*™®^'') 
for which all but finitely many instances x have instance complexities that are es- 
sentially maximal in the sense that ic^ [x ■. A) > C* {x) — 2 log C* {x) — c, where 
c is a constant and t'{n) = cn2^" -|- c. In contrast with this existence result, we 
prove in this paper that almost every language A G E has the property that all 
but finitely many instances x have essentially maximal instance complexities in 
the slightly weaker (but still very strong) sense that ic^ (x : A) > (1 — e)C* (x), 
for any fixed real e > 0, where t'{n) = 2^". We also show that almost every 
A G E 2 = DTIME(2P°^y) has the property that all but finitely many instances 
X satisfy the condition zc^"(x : A) > (x) — C* (x)®, for any fixed real e > 0, 

where t'{n) = 2” . 

Naturally arising problems that are, or are presumed to be, intractable have
usually turned out to be complete for NP or some natural complexity class
containing NP. The complexities of such problems are thus of greater interest than
the complexities of arbitrary problems. The instance complexities of problems
that are complete (or just hard) for NP or exponential time under $\le_m^P$-reductions
have consequently been a focus of investigation.

Regarding problems that are <^-hard for exponential time, Orponen, Ko, 
Schoning and Watanabe m have shown that every such problem H must have 
an exponentially dense set of instances x that are hard in the sense that for 
every polynomial t, ic*{x : H) > C* (x) — 21ogC* (x) — c, where c is a constant 
and t'{n) = cn2^" -|- c. Buhrman and Orponen E| proved a related result stating 
that, if H is actually <J^-complete for exponential time, then H has a dense set 
of instances x that are hard in the sense that for every polynomial t{n) > n^, 
$\mathrm{ic}^t(x : H) > C^t(x) - c$, where c is a constant.

The main results of this paper show that this phenomenon, a dense set
of instances whose complexities are essentially maximal, holds not only for
$\le_m^P$-hard problems for exponential time, but in fact for all weakly $\le_m^P$-hard
problems for exponential time (with slight technical modifications in the instance
complexity bounds). This is a significant extension of the earlier work because
Ambos-Spies, Terwijn and Zheng ^ have shown that almost every problem in
E is weakly $\le_m^P$-hard, but not $\le_m^P$-hard, for E, and similarly for $E_2$.

To be precise, we prove that for every weakly <J^-hard language H for E 2 
and every e > 0 there exists (5 > 0 such that the set of all instances x with 

4„ 

ic^ (x : H) > (1 — e)C^ (cc) is dense, as is the set of all x for which ic^ (x : 

2n^ 2 Ti ^ 

iL) > (x) — C'^ {xY- Since Juedes and Lutz |2j have shown that every 

language that is weakly $\le_m^P$-hard for E is weakly $\le_m^P$-hard for $E_2$ (but not
conversely, even for languages in E), our results hold a fortiori for problems
that are weakly $\le_m^P$-hard for E.

Regarding problems that are NP-complete (of which we take SAT to be the
canonical example), any nontrivial lower bound on instance complexity must be
derived from some unproven hypothesis (or entail a proof that P $\neq$ NP) because
languages in P have bounded instance complexities m. Assuming P $\neq$ NP,
Orponen, Ko, Schöning and Watanabe m showed that for every polynomial
t and constant c, the set $\{x \mid \mathrm{ic}^t(x : \mathrm{SAT}) > c \log |x|\}$ is infinite. Assuming the
hypothesis that nonuniformly secure one-way functions exist (which implies
P $\neq$ NP), Ko Uni proved that this set is nonsparse. Assuming E $\neq$ NE (which
also implies P $\neq$ NP), Orponen, Ko, Schöning and Watanabe showed that
SAT has an infinite set of instances of essentially maximal complexity in the
sense that for every polynomial t there exist a polynomial t', a constant c, and
infinitely many x such that $\mathrm{ic}^t(x : \mathrm{SAT}) > C^{t'}(x) - c$.

The hypothesis that NP does not have p-measure 0, written $\mu_p(\mathrm{NP}) \neq 0$,
has been proposed by Lutz. This hypothesis has been shown to imply reasonable
answers to many complexity-theoretic questions not known to be resolvable using
P $\neq$ NP or other “traditional” complexity-theoretic hypotheses. (Such results are
discussed in the surveys ESI in [30 ) The $\mu_p(\mathrm{NP}) \neq 0$ hypothesis implies the
hypothesis E $\neq$ NE EH] and is equivalent to the assertion that NP does not have
measure 0 in $E_2$ |2|. Here we note that, if $\mu_p(\mathrm{NP}) \neq 0$, then SAT is weakly
$\le_m^P$-hard for $E_2$, whence our above-mentioned results imply that SAT has a
dense set of instances of essentially maximal complexity. That is, if $\mu_p(\mathrm{NP}) \neq 0$,
then for every $\epsilon > 0$ there exists $\delta > 0$ such that the set of all x for which

4n 

i(Y {x : SAT) > (1 — e)C^ (x) is dense, as is the set of all x for which iY (x : 
SAT) >C^""\x)-C^""\xY. 

In the course of this introduction, we have seen that almost every problem 
A in exponential time has both of the following properties. 

1. All but finitely many instances of A have essentially maximal instance com- 
plexity (our abundance results). 

2. A is weakly $\le_m^P$-hard for exponential time.

Thus weakly hard problems can have essentially maximal complexity at almost
every instance. In contrast, we also show that every problem H that is actually
$\le_m^P$-hard for exponential time must have a dense set of instances x that are
unusually easy in the very strong sense that ic^ (x : H) is bounded above by 
a constant. Our proof of this fact is based largely on the proof by Juedes and 
Lutz jSj of an analogous result for complexity cores. 

In section 3 below we present a complete definition of time-bounded instance 
complexity. Section 4 is the main section of this paper. In this section we prove 
our abundance theorems, derive our lower bounds on the instance complexities 
of weakly hard problems, and note the consequences for the complexity of SAT
if $\mu_p(\mathrm{NP}) \neq 0$. In section 5 we prove that every hard problem for exponential
time has a dense set of unusually easy instances.

Due to space limitations, we have omitted all proofs from our paper in these 
proceedings. An expanded version of this paper, now available at http://www.es.- 
iastate.edu/~lutz/papers.html, includes the proofs of our results, along with fur- 
ther discussion of relevant aspects of Kolmogorov complexity, resource-bounded 
measure, weak completeness, complexity cores, and instance complexity. 

2 Notation and Terminology 

A language $A \subseteq \{0,1\}^*$ is sparse if there exists a polynomial q such that
$(\forall n)\ |A \cap \{0,1\}^{\le n}| \le q(n)$, and exponentially dense (or, simply, dense) if there exists a
real number $\epsilon > 0$ such that $(\forall^\infty n)\ |A \cap \{0,1\}^{\le n}| > 2^{n^\epsilon}$.
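
The two notions can be checked numerically for small n. The Python sketch below counts the strings of length at most n in a language given by a membership test; the helper names and the two example languages are assumptions chosen for illustration, and brute-force enumeration is of course only feasible for small n.

```python
from itertools import product
from typing import Callable


def census_up_to(member: Callable[[str], bool], n: int) -> int:
    """Number of binary strings of length at most n accepted by `member`."""
    return sum(
        1
        for length in range(n + 1)
        for bits in product("01", repeat=length)
        if member("".join(bits))
    )


def only_zeros(w: str) -> bool:
    """The language 0*: one string per length, hence sparse."""
    return "1" not in w


def ends_in_zero(w: str) -> bool:
    """Strings ending in 0: half of all nonempty strings, hence dense."""
    return w.endswith("0")
```

For the language of all-zero strings the count up to length n is n + 1, which is polynomially bounded. For strings ending in 0 it is $2^n - 1$, which exceeds $2^{n^\epsilon}$ with, for instance, $\epsilon = 1/2$ for all sufficiently large n.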

Our main results involve resource-bounded measure, which was developed by 
Lutz We refer the interested reader to any of the surveys ITT) 

for discussion of this theory. Recall that a language H is $\le_m^P$-hard for a class C
of languages if $A \le_m^P H$ for all $A \in C$, and $\le_m^P$-complete for C if $H \in C$ and H
is $\le_m^P$-hard for C. Resource-bounded measure allowed Lutz to generalize these
notions as follows. (We write $P_m(H) = \{A \mid A \le_m^P H\}$.)

Definition 2.1. A language $H \subseteq \{0,1\}^*$ is weakly $\le_m^P$-hard for E (respectively,
for $E_2$) if $\mu(P_m(H) \mid E) \neq 0$ (respectively, $\mu(P_m(H) \mid E_2) \neq 0$). A language
$H \subseteq \{0,1\}^*$ is weakly $\le_m^P$-complete for E (respectively, for $E_2$) if $H \in E$ (respectively,
$H \in E_2$) and H is weakly $\le_m^P$-hard for E (respectively, for $E_2$).

It is clear that every $\le_m^P$-hard language for E is weakly $\le_m^P$-hard for E, and
similarly for $E_2$.

3 Instance Complexity 

Following [20], we define an interpreter to be a deterministic Turing machine with
a read-only program tape, a read-only input tape, a write-only output tape, and
an arbitrary number of read/write work tapes, all with alphabet $\{0, 1, \sqcup\}$, where
$\sqcup$ is the blank symbol. Given a program $\pi \in \{0,1\}^*$ on the program tape and an
input $x \in \{0,1\}^*$ on the input tape, an interpreter $M$ may eventually halt in an
accepting configuration, a rejecting configuration, an undecided configuration,
or an output configuration, or it may fail to halt. If $M$ halts in an accepting
configuration, we say that $\pi$ accepts $x$ on $M$, and we write $M(\pi,x) = 1$. If $M$
halts in a rejecting configuration, we say that $\pi$ rejects $x$ on $M$, and we write
$M(\pi,x) = 0$. In either of these two cases, we say that $\pi$ decides $x$ on $M$. If $M$
halts in an undecided configuration, or if $M$ fails to halt, we say that $\pi$ fails to
decide $x$ on $M$, and we write $M(\pi,x) = \bot$. If $M$ halts in an output configuration
with output $y \in \{0,1\}^*$ on the output tape, we write $M(\pi,x) = y$. (If $y$ is 0
or 1, the context will always make it clear whether "$M(\pi,x) = y$" refers to a
decision or an output.)

We write $\mathrm{time}_M(\pi,x)$ for the running time of $M$ with program $\pi$ and input
$x$. If $M(\pi,x) = \bot$, we stipulate that $\mathrm{time}_M(\pi,x) = \infty$.

A program $\pi$ is consistent with a language $A \subseteq \{0,1\}^*$ relative to an
interpreter $M$ if for all $x \in \{0,1\}^*$, $M(\pi,x) \in \{[\![x \in A]\!], \bot\}$, i.e., $\pi$ either decides $x$
correctly for $A$ or else fails to decide $x$.

We now recall the definition of time-bounded instance complexity, which is 
the main topic of this paper. 

Definition 3.1. (Orponen, Ko, Schöning and Watanabe [20]) Let $M$ be an
interpreter, $t : \mathbb{N} \to \mathbb{N}$, $A \subseteq \{0,1\}^*$, and $x \in \{0,1\}^*$. The $t$-time-bounded instance
complexity of $x$ with respect to $A$ given $M$ is

$$ic^t_M(x : A) = \min\{\, |\pi| \;:\; \pi \text{ is consistent with } A \text{ relative to } M \text{ and } \mathrm{time}_M(\pi,x) \le t(|x|) \,\},$$

where $\min \emptyset = \infty$.

Thus $ic^t_M(x : A)$ is the minimum number of bits required for a program $\pi$
to decide $x$ correctly for $A$ on $M$, subject to the constraints that $\pi$ is consistent
with $A$ relative to $M$ and $M(\pi,x)$ does not run for more than $t(|x|)$ steps.
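
To make Definition 3.1 concrete, here is a small brute-force sketch. Everything in it is an assumption made for illustration: `toy_interpreter` is a hypothetical stand-in for an interpreter $M$ (not a Turing machine), its step count is invented, and consistency is only checked on strings up to a fixed length rather than on all of $\{0,1\}^*$.

```python
from itertools import product

UNDECIDED = None  # plays the role of the "fails to decide" outcome

def strings_up_to(n):
    """All bit strings of length 0..n, shortest first."""
    for k in range(n + 1):
        for bits in product("01", repeat=k):
            yield "".join(bits)

def toy_interpreter(prog, x):
    """Hypothetical interpreter: returns (decision, steps).

    The empty program decides every input by the parity of its 1s.
    A program b+w decides only the input w, with answer b.
    """
    steps = len(prog) + len(x) + 1  # invented cost model
    if prog == "":
        return (x.count("1") % 2, steps)
    if prog[1:] == x:
        return (int(prog[0]), steps)
    return (UNDECIDED, steps)

def consistent(prog, in_A, domain, M):
    """prog never gives a wrong answer for A on the finite test domain."""
    return all(M(prog, y)[0] in (UNDECIDED, int(in_A(y))) for y in domain)

def instance_complexity(x, in_A, t, M=toy_interpreter, max_prog_len=8, domain_len=4):
    """Length of a shortest consistent program deciding x within t(|x|) steps."""
    domain = list(strings_up_to(max(domain_len, len(x))))
    for prog in strings_up_to(max_prog_len):  # shortest programs first
        decision, steps = M(prog, x)
        if decision is UNDECIDED or steps > t(len(x)):
            continue
        if consistent(prog, in_A, domain, M):
            return len(prog)
    return float("inf")  # mirrors the convention min(empty set) = infinity
```

On this toy interpreter the parity language has instance complexity 0 everywhere, whereas for a language the empty program gets wrong, an instance $x$ costs $|x| + 1$ bits, the length of a lookup program for $x$ alone.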

Note. Our definition of $ic^t_M(x : A)$ differs from that in [20] in that we do
not require $M(\pi,y)$ to halt within $t(|y|)$ steps - or even to halt at all - for
$y \neq x$. In our complexity-theoretic setting, with time-constructible functions $t$,
this difference is technical and minor (at most a constant number of bits and a
$\log t$ factor in the time bound), and it simplifies some technical results. In other
settings, such as that of the time-unbounded instance complexity conjecture,
the halting behavior for $y \neq x$ is a more critical issue.

4 Hard Instances 

In this section we prove our main results. We show that almost every instance
of almost every problem in E has essentially maximal instance complexity, and
similarly for $\mathrm{E}_2$. Using this, we show that every problem that is weakly
$\le^P_m$-hard for either of these classes has an exponentially dense set of such maximally
hard instances. We begin with our abundance theorem in E. In contrast with
Theorem 5.11 of Orponen, Ko, Schöning and Watanabe [20], which asserts the
existence of a language in E with essentially maximal instance complexity, the
following result says that almost every language in E has this property, albeit
with a slightly weaker interpretation of "essentially maximal".

Theorem 4.1. For all $c \in \mathbb{Z}^+$ and $\varepsilon > 0$, the set

X(c, e) = {A\(y°°x)i(? {x : A) > {1 — e)C^^ ^ ' (a;)} 

has p-measure 1, hence measure 1 in E. 

Theorem 4.1 has the following analog in $\mathrm{E}_2$.

Theorem 4.2. For all $c \in \mathbb{Z}^+$ and $\varepsilon > 0$, the set

„(c+l) „(c+l) 

X 2 (c, e) = {A\{y^x)ic^ {x-.A)> {x) - (a;)^} 

has $p_2$-measure 1, hence measure 1 in $\mathrm{E}_2$.

Before proceeding, we note that Theorems 4.1 and 4.2 imply the following
known fact, which was proven independently by Juedes and Lutz [8] (as stated)
and Mayordomo [18] (in terms of bi-immunity).

Corollary 4.3. (Juedes and Lutz [8], Mayordomo [18]) Let $c \in \mathbb{Z}^+$.

1. Almost every language in E has $\{0,1\}^*$ as a $\mathrm{DTIME}(2^{cn})$-complexity core.

2. Almost every language in $\mathrm{E}_2$ has $\{0,1\}^*$ as a $\mathrm{DTIME}(2^{n^c})$-complexity core.

Our next task is to use Theorems 4.1 and 4.2 to prove that every weakly
$\le^P_m$-hard language for exponential time has a dense set of very hard instances.
For this purpose we need a few basic facts about the behavior of polynomial-time
reductions in connection with time-bounded Kolmogorov complexity,
time-bounded instance complexity, and density.

The data processing inequality of classical information theory [6] says that
the entropy (Shannon information content) of a source cannot be increased by
performing a deterministic computation on its output. The analogous data
processing inequality for plain Kolmogorov complexity says that if $f$ is a
computable function, then $C(f(x))$, which is the algorithmic information content of
$f(x)$, cannot exceed $C(x)$, the algorithmic information content of $x$, by more
than a constant number of bits. The following lemma is a time-bounded version
of this fact. It is essentially well-known, though perhaps not in precisely this
form.

Lemma 4.4. (data processing inequality) For each $f \in \mathrm{PF}$, there exist a
polynomial $q$ and a constant $c \in \mathbb{N}$ such that for all $x \in \{0,1\}^*$ and all nondecreasing
$t : \mathbb{N} \to \mathbb{N}$,

$$|f(x)| \ge |x| \implies C^{t''}(f(x)) \le C^{t}(x) + c,$$

where $t''(n) = c\,t'(n)\log(t'(n)) + c$ and $t'(n) = t(n) + q(n)$.

Our next lemma is a straightforward extension of Proposition 3.5 of [20].

Lemma 4.5. For each $f \in \mathrm{PF}$ there exist a polynomial $q$ and a constant $c \in \mathbb{N}$
such that for all $A \subseteq \{0,1\}^*$, $x \in \{0,1\}^*$, and nondecreasing $t : \mathbb{N} \to \mathbb{N}$,

$$ic^{t''}(x : f^{-1}(A)) \le ic^{t}(f(x) : A) + c,$$

where $t''(n) = c\,t'(n)\log(t'(n)) + c$ and $t'(n) = q(n) + t(q(n))$.
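
The shape of Lemma 4.5 can be illustrated with a short sketch. The paper omits the proof, so this is only the natural reading under stated assumptions: a partial decider for $A$ becomes a partial decider for $f^{-1}(A)$ by applying $f$ first, and the time bound is composed as in the lemma. The function names are hypothetical, the logarithm is taken base 2, and $t'(n) \ge 1$ is assumed.

```python
import math

def pulled_back_time_bound(t, q, c):
    """Return t'' where t'(n) = q(n) + t(q(n)) and t''(n) = c*t'(n)*log2(t'(n)) + c."""
    def t_prime(n):
        return q(n) + t(q(n))

    def t_double_prime(n):
        tp = t_prime(n)  # assumed to be at least 1 so the logarithm is defined
        return c * tp * math.log2(tp) + c

    return t_double_prime

def pull_back(decider_for_A, f):
    """Turn a partial decider for A (1, 0 or None) into one for the preimage of A."""
    return lambda x: decider_for_A(f(x))

# Illustration: A = strings whose length is divisible by 4, decided only up to length 8.
def decide_A(y):
    if len(y) > 8:
        return None  # undecided on long inputs
    return 1 if len(y) % 4 == 0 else 0

double = lambda x: x + x          # a length-doubling reduction
decide_preimage = pull_back(decide_A, double)
```

With these choices `decide_preimage` accepts exactly the even-length strings it decides, and stays undecided where `decide_A` does.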

The following consequence of Lemma 4.5 is especially useful here.

Corollary 4.6. For each $f \in \mathrm{PF}$ there exist $\delta > 0$ and $c \in \mathbb{N}$ such that for all
but finitely many $x \in \{0,1\}^*$, for all $A \subseteq \{0,1\}^*$,

(fix) : A) > ic^"{x : f~^(A)) - c. 

Juedes and Lutz [8] introduced the following useful notation. The nonreduced
image of a language $S \subseteq \{0,1\}^*$ under a function $f : \{0,1\}^* \to \{0,1\}^*$ is the
language

$$f^{\ge}(S) = \{\, f(x) \mid x \in S \text{ and } |f(x)| \ge |x| \,\}.$$

Lemma 4.7. (Juedes and Lutz [8]) If $f \in \mathrm{PF}$ is one-to-one a.e. and $S \subseteq \{0,1\}^*$
is cofinite, then $f^{\ge}(S)$ is dense.
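
As a small illustration of the nonreduced image, the sketch below computes it for a finite set of strings. The function name `nonreduced_image` and the example map are assumptions chosen for the illustration; the paper works with infinite languages and polynomial-time $f$.

```python
def nonreduced_image(f, S):
    """Images f(x) of those x in S that f does not shorten."""
    return {f(x) for x in S if len(f(x)) >= len(x)}

# Hypothetical example map: strip trailing zeros (shortens some inputs).
strip_zeros = lambda x: x.rstrip("0")
sample = {"", "1", "10", "110", "011"}
image = nonreduced_image(strip_zeros, sample)
```

Only the inputs without a trailing zero contribute, since every other input is mapped to a strictly shorter string.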

We now prove that every weakly $\le^P_m$-hard language for exponential time has
a dense set of very hard instances. Orponen, Ko, Schöning, and Watanabe [20]
have shown that every $\le^P_m$-hard language for exponential time has a dense set of
very hard instances, and Buhrman and Orponen [4] have proven a similar result
with improved time bounds and density for languages that are $\le^P_m$-complete
for exponential time. Theorems 4.8 and 4.9 below can be regarded as extending
this phenomenon (with some modification in the precise bounds) to all weakly
$\le^P_m$-hard languages for exponential time.

Juedes and Lutz [9] have proven that every weakly $\le^P_m$-hard language for E
is weakly $\le^P_m$-hard for $\mathrm{E}_2$, but that the converse fails, even for languages in E.
We thus state our results in terms of weakly $\le^P_m$-hard languages for $\mathrm{E}_2$, noting
that they hold a fortiori for languages that are weakly $\le^P_m$-hard for E.

Theorem 4.8. If $H$ is weakly $\le^P_m$-hard for $\mathrm{E}_2$, then for every $\varepsilon > 0$ there exists
$\delta > 0$ such that the set

HT’^H) = {x\ic^"' {x-.H)> {I- (a;)} 



is dense. 

Using Theorem 4.1 in place of Theorem 4.2 we prove the following similar
result.

Theorem 4.9. If $H$ is weakly $\le^P_m$-hard for $\mathrm{E}_2$, then for every $\varepsilon > 0$ there exists
$\delta > 0$ such that the set

HII'\h) = {x:H)> (cc) - (x)'^} 



is dense. 

It can be shown that Theorem 4.9 implies (and is much stronger than) the
following known result.

Corollary 4.10. (Juedes and Lutz) If $H$ is weakly $\le^P_m$-hard for $\mathrm{E}_2$, then $H$
has a dense exponential complexity core.

Theorems 4.8 and 4.9 are incomparable in strength because Theorem 4.8
gives a tighter time bound on the plain Kolmogorov complexity, while
Theorem 4.9 gives a tighter bound on the closeness of the time-bounded instance
complexity to the time-bounded plain Kolmogorov complexity. For most strings
$x$, $C^t(x)$ and $C(x)$ are both very close to $|x|$, so the time bound on $C^t(x)$ is
often of secondary significance. Thus for many purposes, the following simple
consequence of Theorem 4.9 suffices.

Corollary 4.11. If $H$ is weakly $\le^P_m$-hard for E or $\mathrm{E}_2$, then for every $\varepsilon > 0$ there
exists $\delta > 0$ such that the set

HEf\H) = {x\ic^"\x : H) > C{x) - C{xf} 



is dense. 

We conclude this section with a discussion of the instance complexities of NP- 
complete problems. For simplicity of exposition we focus on SAT, but the entire 
discussion extends routinely to other NP-complete problems. 

We start with three known facts. The first says that the hypothesis $\mathrm{P} \neq \mathrm{NP}$
implies a lower bound on the instance complexity of SAT.

Theorem 4.12. (Orponen, Ko, Schöning and Watanabe [20]) If $\mathrm{P} \neq \mathrm{NP}$, then
for every polynomial $t$ and constant $c \in \mathbb{N}$, the set

$$\{\, x \mid ic^{t}(x : \mathrm{SAT}) > c \log |x| \,\}$$

is infinite.

Each of the next two facts derives a stronger conclusion than Theorem 4.12 from
a stronger hypothesis.

Theorem 4.13. (Ko [10]) If nonuniformly secure one-way functions exist, then
for every polynomial $t$ and constant $c \in \mathbb{N}$, the set

$$\{\, x \mid ic^{t}(x : \mathrm{SAT}) > c \log |x| \,\}$$

is nonsparse.

Theorem 4.14. (Orponen, Ko, Schöning and Watanabe [20]) If $\mathrm{E} \neq \mathrm{NE}$, then
for every polynomial $t$ there exist a polynomial $t'$ and a constant $c \in \mathbb{N}$ such that
the set

$$\{\, x \mid ic^{t}(x : \mathrm{SAT}) > C^{t'}(x) - c \,\}$$

is infinite.

The following theorem derives a strong lower bound on the instance
complexity of SAT from the hypothesis that $\mu_p(\mathrm{NP}) \neq 0$. This hypothesis, which
was proposed by Lutz, has been proven to have many reasonable consequences.
The $\mu_p(\mathrm{NP}) \neq 0$ hypothesis implies $\mathrm{E} \neq \mathrm{NE}$ and is equivalent
to the assertion that NP does not have measure 0 in $\mathrm{E}_2$. Its relationship to
the hypothesis of Theorem 4.13 is an open question.

Theorem 4.15. If $\mu_p(\mathrm{NP}) \neq 0$, then for every $\varepsilon > 0$ there exists $\delta > 0$ such
that the sets

HT{\SAT) = {x\ic^’'\x : SAT) > (1 - e)C'^""(a;)}, 

HI f {SAT) = {x\ic^’'\x : SAT) > C^""\x) - C‘^""\x)^} 

are dense. 

5 Easy Instances 

In this brief section, we note that languages that are $\le^P_m$-hard for exponential
time have instance complexities that are unusually low in the sense that they
obey an upper bound that is violated by almost every language in exponential
time. Our proof is based on the following known result.

Theorem 5.1. (Juedes and Lutz [8]) For every $\le^P_m$-hard language $H$ for E,
there exist $B, D \in \mathrm{DTIME}(2^{4n})$ such that $D$ is dense and $B = H \cap D$.

The following theorem gives an upper bound on the instance complexities of
hard problems for exponential time. It says that every such problem has a dense
set of (relatively) easy instances.

Theorem 5.2. For every $\le^P_m$-hard language $H$ for E there is a constant $c \in \mathbb{N}$
such that the set

EIc{H) = {x\ic^ (x : H) < c} 

is dense. 
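
The paper omits its proofs, but one natural way to read how Theorem 5.1 could yield constant-size programs is the following sketch: a single fixed program answers via $B$ on inputs in $D$ and stays undecided elsewhere, so it is consistent with $H$ and decides every $x \in D$. The function name and the toy languages are assumptions made for the illustration only.

```python
from itertools import product

def fixed_decider(in_B, in_D):
    """One program for all inputs: answer via B inside D, undecided outside D."""
    def decide(x):
        if in_D(x):
            return 1 if in_B(x) else 0
        return None  # fails to decide, which never breaks consistency
    return decide

# Toy stand-ins (hypothetical): H is not actually hard, D is not the set from Theorem 5.1.
in_H = lambda x: x.count("1") % 3 == 0
in_D = lambda x: len(x) % 2 == 0
in_B = lambda x: in_D(x) and in_H(x)      # B = H intersected with D
decide = fixed_decider(in_B, in_D)

all_short = ["".join(b) for k in range(6) for b in product("01", repeat=k)]
is_consistent = all(decide(x) in (None, int(in_H(x))) for x in all_short)
```

Because the same program is reused for every $x$, its length does not depend on $x$, which is the behaviour the constant $c$ in Theorem 5.2 describes.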

By Theorem 4.1, almost every language in exponential time violates the
upper bound given by Theorem 5.2. Thus these two results together imply the
known fact that the set of $\le^P_m$-hard languages for exponential time has
p-measure 0. It should also be noted that Ambos-Spies, Terwijn and Zheng [2]
have shown that almost every language in E is weakly $\le^P_m$-hard for E. It
follows by Theorem 4.1 that almost every language in E is weakly $\le^P_m$-hard for E
and violates the instance complexity upper bound given by Theorem 5.2. Thus
Theorem 5.2 cannot be extended to the weakly $\le^P_m$-hard problems for E.






References 

[1] K. Ambos-Spies and E. Mayordomo. Resource-bounded measure and randomness.
In A. Sorbi, editor, Complexity, Logic and Recursion Theory, Lecture Notes in
Pure and Applied Mathematics, pages 1-47. Marcel Dekker, New York, N.Y., 1997.

[2] K. Ambos-Spies, S. A. Terwijn, and X. Zheng. Resource bounded randomness and
weakly complete problems. Theoretical Computer Science, 172:195-207, 1997.

[3] H. Buhrman and E. Mayordomo. An excursion to the Kolmogorov random strings.
Journal of Computer and System Sciences, 54:393-399, 1997.

[4] H. Buhrman and P. Orponen. Random strings make hard instances. Journal of
Computer and System Sciences, 53:261-266, 1996.

[5] H. Buhrman and L. Torenvliet. Complete sets and structure in subrecursive
classes. In Proceedings of Logic Colloquium '96, pages 45-78. Springer-Verlag, 1998.

[6] T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley &
Sons, Inc., New York, N.Y., 1991.

[7] L. Fortnow and M. Kummer. On resource-bounded instance complexity.
Theoretical Computer Science, 161:123-140, 1996.

[8] D. W. Juedes and J. H. Lutz. The complexity and distribution of hard problems.
SIAM Journal on Computing, 24(2):279-295, 1995.

[9] D. W. Juedes and J. H. Lutz. Weak completeness in E and E_2. Theoretical
Computer Science, 143:149-158, 1995.

[10] K. Ko. A note on the instance complexity of pseudorandom sets. In Proceedings of
the Seventh Annual Structure in Complexity Theory Conference, pages 327-337.
IEEE Comput. Soc. Press, 1992.

[11] M. Kummer. On the complexity of random strings. In 13th Annual Symposium
on Theoretical Aspects of Computer Science, pages 25-36. Springer, 1996.

[12] M. Li and P. M. B. Vitanyi. An Introduction to Kolmogorov Complexity and its
Applications. Springer-Verlag, Berlin, 1997. Second Edition.

[13] J. H. Lutz. Almost everywhere high nonuniform complexity. Journal of Computer
and System Sciences, 44:220-258, 1992.

[14] J. H. Lutz. The quantitative structure of exponential time. In L.A. Hemaspaandra
and A.L. Selman, editors, Complexity Theory Retrospective II, pages 225-254.
Springer-Verlag, 1997.

[15] J. H. Lutz. Resource-bounded measure. In Proceedings of the 13th IEEE
Conference on Computational Complexity, pages 236-248, New York, 1998. IEEE.

[16] J. H. Lutz and E. Mayordomo. Cook versus Karp-Levin: Separating completeness
notions if NP is not small. Theoretical Computer Science, 164:141-163, 1996.

[17] J. H. Lutz and E. Mayordomo. Twelve problems in resource-bounded measure.
Bulletin of the European Association for Theoretical Computer Science, 68:64-80, 1999.

[18] E. Mayordomo. Almost every set in exponential time is P-bi-immune. Theoretical
Computer Science, 136(2):487-506, 1994.

[19] M. Mundhenk. NP-hard sets have many hard instances. In Mathematical
Foundations of Computer Science 1997, pages 428-437. Springer-Verlag, 1997.

[20] P. Orponen, K. Ko, U. Schöning, and O. Watanabe. Instance complexity. Journal
of the Association of Computing Machinery, 41:96-121, 1994.




Simulation and Bisimulation over 
One-Counter Processes 



Petr Jančar¹, Antonín Kučera², and Faron Moller³

¹ Technical University Ostrava, Czech Republic (Petr.Jancar@vsb.cz)
² Faculty of Informatics MU, Czech Republic (tony@fi.muni.cz)

³ Uppsala University, Sweden (fm@csd.uu.se)



Abstract. We show an effective construction of (a periodicity description of) the 
maximal simulation relation for a given one-counter net. Then we demonstrate 
how to reduce simulation problems over one-counter nets to analogous bisimula- 
tion problems over one-counter automata. We use this to demonstrate the decid- 
ability of various problems, specifically testing regularity and strong regularity of 
one-counter nets with respect to simulation equivalence, and testing simulation 
equivalence between a one-counter net and a deterministic pushdown automaton. 
Various obvious generalisations of these problems are known to be undecidable. 



1 Introduction 

In concurrency theory, a process is typically defined to be a state in a transition system,
which is a triple $T = (S, L, \to)$ where $S$ is a set of states, $L$ is a set of actions (assumed
to be finite in this paper) and ${\to} \subseteq S \times L \times S$ is a transition relation. We write $s \xrightarrow{a} t$
instead of $(s, a, t) \in {\to}$, and we extend this notation in the natural way to elements of
$L^*$. A state $t$ is reachable from a state $s$ iff $s \xrightarrow{w} t$ for some $w \in L^*$. $T$ is image-finite
iff for all $s \in S$ and $a \in L$ the set $\{t : s \xrightarrow{a} t\}$ is finite; $T$ is deterministic if each such
set is of size at most 1.

In this paper, we consider such processes generated by one-counter automata,
nondeterministic finite-state automata operating on a single counter variable ranging over
the set $\mathbb{N}$ of nonnegative integers. Formally this is a tuple $M = (Q, L, \delta^{=}, \delta^{>})$ where
$Q$ is a finite set of control states, $L$ is a finite set of actions, and
$\delta^{=} : Q \times L \to \mathcal{P}(Q \times \{0, 1\})$, $\delta^{>} : Q \times L \to \mathcal{P}(Q \times \{-1, 0, 1\})$ are transition functions
(where $\mathcal{P}(A)$ denotes the set of subsets of $A$). $\delta^{=}$ represents the transitions which are enabled when
the counter value is zero, and $\delta^{>}$ represents the transitions which are enabled when the
counter value is positive. $M$ is a one-counter net iff $\forall q \in Q, \forall a \in L : \delta^{=}(q, a) \subseteq
\delta^{>}(q, a)$. To $M$ we associate the (image-finite) transition system $T_M = (S, L, \to)$,
where $S = \{p(n) : p \in Q, n \in \mathbb{N}\}$ and $\to$ is defined as follows:

$p(n) \xrightarrow{a} p'(n+i)$ iff

    $n = 0$, and $(p', i) \in \delta^{=}(p, a)$; or
    $n > 0$, and $(p', i) \in \delta^{>}(p, a)$.



Note that any transition increments, decrements, or leaves unchanged the counter value; 
and a decrementing transition is only possible if the counter value is positive. Also 
observe that when n>0 the transitions of p (n) do not depend on the actual value of n. 
Finally, note that a one-counter net can in a sense test if its counter is nonzero (that is, 
it can perform some transitions only on the proviso that its counter is nonzero), but it 
cannot test in any sense if its counter is zero. 

As an example, we might take $Q = \{p\}$, $L = \{a, z\}$, and take the only non-empty
transition function values to be $\delta^{>}(p, a) = \{(p, +1), (p, -1)\}$, $\delta^{=}(p, a) = \{(p, +1)\}$,
and $\delta^{=}(p, z) = \{(p, 0)\}$. This one-counter automaton gives rise to the infinite-state
transition system depicted in Fig. 1; if we eliminate the z-action, then this would be
a one-counter net. The class of transition systems which are generated by one-counter 
nets is the same (up to isomorphism) as that generated by the class of labelled Petri 
nets with (at most) one unbounded place. The class of transition systems which are 
generated by one-counter automata is the same (up to isomorphism) as that generated 
by the class of realtime pushdown automata with a single stack symbol (apart from a 
special bottom-of-stack marker). 
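
The definitions above translate directly into a small data structure. The following is a minimal sketch, not the authors' implementation: the class name `OneCounterAutomaton` and the field names `delta_zero` and `delta_pos` (for the zero-counter and positive-counter transition functions) are hypothetical, and the instance below encodes the single-state example just described.

```python
from typing import Dict, Set, Tuple

Key = Tuple[str, str]                      # (control state, action)
Moves = Dict[Key, Set[Tuple[str, int]]]    # targets: (next state, counter change)

class OneCounterAutomaton:
    def __init__(self, states, actions, delta_zero: Moves, delta_pos: Moves):
        # At counter value zero only changes 0 and +1 are allowed;
        # at positive values -1, 0 and +1 are allowed.
        for moves in delta_zero.values():
            assert all(i in (0, 1) for _, i in moves)
        for moves in delta_pos.values():
            assert all(i in (-1, 0, 1) for _, i in moves)
        self.states, self.actions = set(states), set(actions)
        self.delta_zero, self.delta_pos = delta_zero, delta_pos

    def successors(self, p, n):
        """All (action, next state, next counter value) steps from the process p(n)."""
        delta = self.delta_zero if n == 0 else self.delta_pos
        return {(a, p2, n + i)
                for a in self.actions
                for (p2, i) in delta.get((p, a), set())}

    def is_net(self):
        """One-counter net: every zero-counter move is also a positive-counter move."""
        return all(self.delta_zero.get((q, a), set()) <= self.delta_pos.get((q, a), set())
                   for q in self.states for a in self.actions)

example = OneCounterAutomaton(
    {"p"}, {"a", "z"},
    delta_zero={("p", "a"): {("p", 1)}, ("p", "z"): {("p", 0)}},
    delta_pos={("p", "a"): {("p", 1), ("p", -1)}},
)
without_z = OneCounterAutomaton(
    {"p"}, {"a"},
    delta_zero={("p", "a"): {("p", 1)}},
    delta_pos={("p", "a"): {("p", 1), ("p", -1)}},
)
```

The `is_net` check makes the remark about the z-action concrete: `example` is not a net because the zero test z has no counterpart at positive counter values, while `without_z` is.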

Given a transition system $T = (S, L, \to)$, a simulation is a binary relation $\mathcal{R} \subseteq S \times S$
satisfying: whenever $(s, t) \in \mathcal{R}$, if $s \xrightarrow{a} s'$ then $t \xrightarrow{a} t'$ for some $t'$ with $(s', t') \in \mathcal{R}$.
$s$ is simulated by $t$, written $s \sqsubseteq t$, iff $(s, t) \in \mathcal{R}$ for some simulation $\mathcal{R}$; and $s$ and $t$ are
simulation equivalent, written $s \simeq t$, iff $s \sqsubseteq t$ and $t \sqsubseteq s$. (The relation $\sqsubseteq$, being the
union of all simulation relations, is in fact the maximal simulation relation.) A
bisimulation is a symmetric simulation relation, and $s$ and $t$ are bisimulation equivalent, or
bisimilar, written $s \sim t$, if they are related by a bisimulation. Simulations and
bisimulations can also be used to relate states of different transition systems; formally, we can
consider two transition systems to be a single one by taking the disjoint union of their
state sets.
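
For a finite transition system the maximal simulation relation can be computed by a standard greatest-fixed-point refinement: start from all pairs and delete pairs that violate the simulation condition until nothing changes. The sketch below illustrates the definition only; it is not the construction of this paper, which deals with infinite-state systems, and the function name is hypothetical.

```python
def maximal_simulation(states, actions, transitions):
    """Largest simulation relation on a finite system given as (s, a, t) triples."""
    succ = {(s, a): set() for s in states for a in actions}
    for (s, a, t) in transitions:
        succ[(s, a)].add(t)

    rel = {(s, t) for s in states for t in states}   # start from everything
    changed = True
    while changed:
        changed = False
        for (s, t) in sorted(rel):                   # iterate over a snapshot
            # every move of s must be matched by a move of t into the relation
            answered = all(any((s2, t2) in rel for t2 in succ[(t, a)])
                           for a in actions for s2 in succ[(s, a)])
            if not answered:
                rel.discard((s, t))
                changed = True
    return rel

# u loops on a, w does one a-step into the deadlocked state v.
sim = maximal_simulation(
    ["u", "v", "w"], ["a"],
    {("u", "a", "u"), ("w", "a", "v")},
)
```

In this example `v` is simulated by every state, `w` is simulated by `u`, and no two distinct states are simulation equivalent.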

There are various other equivalences over processes which have been studied within
the framework of concurrency theory; an overview and comparison of these is presented
in the literature. Each has its specific advantages and disadvantages, and consequently none is
universally accepted as the "best" one, although it seems that simulation and bisimulation
equivalences are of particular importance as their accompanying theory has been
intensively developed. Bisimilarity is especially mathematically tractable, having the
best polynomial-time algorithms over finite-state transition systems (while all
language-based equivalences by comparison are PSPACE-complete), and being the only one which is
decidable for various classes of infinite-state systems such as context-free processes and
commutative context-free processes.

Let $s$ be a state of a transition system $T$ and $\approx$ be an equivalence over the class
of all processes (that is, all states of all transition systems). $s$ is $\approx$-regular, or regular
w.r.t. $\approx$, iff $s \approx f$ for some state $f$ of a finite-state transition system; and $s$ is strongly
$\approx$-regular, or strongly regular w.r.t. $\approx$, iff only finitely many states, up to $\approx$, are
reachable from $s$. For bisimilarity, these two concepts coincide, but this is not true in general
for other equivalences. For example, the state $p(0)$ of the infinite-state transition
system depicted in Fig. 1 is $\simeq$-regular, being simulation equivalent to the state U of the
depicted finite-state system. However, it is not strongly $\simeq$-regular (nor $\sim$-regular) as
$p(i) \not\simeq p(j)$ whenever $i < j$. The conditions of regularity and strong regularity say that
a process can in some sense be finitely represented (up to the equivalence); in the first
case there is an equivalent finite-state process; and in the second case the quotient of
its state-space under the equivalence is finite. As all "reasonable" process equivalences
are preserved under their respective quotients [8] (that is, each state is equivalent to its
equivalence class in the automaton produced by collapsing equivalent states), strong
regularity in fact guarantees the existence of a finite-state process whose state-space is
the same (up to the equivalence); this process provides a more robust description of the
original process as it preserves strictly more logical properties than a process which is
just equivalent (3).

Fig. 1. A one-counter automata process and a simulation-equivalent finite-state process.
Finite descriptions of infinite- state processes are important from the point of view 
of automatic formal verification. Verification tools typically work only for finite-state 
systems, and the types of systems which they analyze, such as protocols, are typically 
semantically finite-state. However, these systems are often expressed syntactically as 
infinite-state systems, for example maintaining a count of how many unacknowledged 
messages have been sent, so it is advantageous to develop algorithms which replace 
infinite-state processes with equivalent finite-state systems (when they exist). Examples 
of such algorithms appear in II2I4I.'5I8I1 II 

In Section 2 we show an effective construction of (a periodicity description of) the
maximal simulation relation for a given one-counter net. Then, in Section 3, we study
the connection between simulation and bisimulation relations, and demonstrate the
decidability of the $\simeq$-regularity and strong $\simeq$-regularity problems for one-counter nets,
a restricted form of Petri nets; the $\simeq$-regularity problem is reduced to the $\sim$-regularity
problem for the more general class of one-counter automata, which is known to be
decidable. Note that the $\simeq$-regularity problem is known to be undecidable for general
Petri nets and an incomparable class of PA processes. Finally, we demonstrate
how to decide simulation equivalence between (a process related to) a one-counter net
and (a process related to) a deterministic pushdown automaton. Here note that
simulation equivalence between a (nondeterministic) one-counter automaton and a
deterministic one-counter automaton (i.e., a special deterministic pushdown automaton) can be
demonstrated to be undecidable.



2 Simulation on One-Counter Nets 

In this section we fix a one-counter net with control state set $Q$, and present an algorithm
which constructs a (simple) description of the set

$$\mathcal{S} = \{\, (p(m), q(n)) : p, q \in Q,\ m, n \in \mathbb{N},\ p(m) \sqsubseteq q(n) \,\}$$

i.e., the maximal simulation relation on the transition system associated to the net. $\mathcal{S}$ can
be viewed as a collection of $|Q|^2$ subsets of $\mathbb{N} \times \mathbb{N}$: to each $p, q \in Q$ we associate
$\mathcal{S}_{(p,q)} = \{(m, n) : p(m) \sqsubseteq q(n)\}$. Observe that if $p(m) \sqsubseteq q(n)$ then $p(m') \sqsubseteq
q(n')$ for all $m' \le m$ and $n' \ge n$ since the set $\{(p(m'), q(n')) : p(m) \sqsubseteq q(n)$ for some
$m \ge m'$, $n \le n'\}$ is a simulation relation.

By a colouring we mean a function $C : (Q \times Q) \to (\mathbb{N} \times \mathbb{N} \to \{\text{black}, \text{white}\})$, where
we write the function applications as $C_{(p,q)}(m, n)$. We further stipulate that a colouring
must satisfy the following monotonicity condition: if $C_{(p,q)}(m, n) = \text{black}$ then
$C_{(p,q)}(m', n') = \text{black}$ for all $m' \le m$ and $n' \ge n$. With this proviso, each $C_{(p,q)}$ is
determined by the frontier function $f^{C}_{(p,q)} : \mathbb{N} \to \mathbb{N} \cup \{\omega\}$ defined by:
$f^{C}_{(p,q)}(n) = \min\{m : C_{(p,q)}(m, n) = \text{white}\}$; we put $f^{C}_{(p,q)}(n) = \omega$ if $C_{(p,q)}(m, n) = \text{black}$ for
all $m$. Note that this function is nondecreasing, i.e., each step $f^{C}_{(p,q)}(n+1) - f^{C}_{(p,q)}(n)$
is nonnegative. When $f^{C}_{(p,q)}(n) \in \mathbb{N}$, we call the pair $(f^{C}_{(p,q)}(n), n)$ a frontier point
and the set of all frontier points constitutes the frontier (in $C_{(p,q)}$).
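
The frontier function can be read off a colouring row by row. The sketch below is a hedged illustration under explicit assumptions: one component $C_{(p,q)}$ is modelled as a Python function of $(m, n)$, the search for the first white point is truncated at a hypothetical bound `m_bound`, and `OMEGA` stands for $\omega$. The helper names are not from the paper.

```python
OMEGA = float("inf")  # stands for omega: no white point in the row

def frontier(colour, n, m_bound):
    """Least m < m_bound with colour(m, n) white, or OMEGA if none is found."""
    for m in range(m_bound):
        if colour(m, n) == "white":
            return m
    return OMEGA

def is_monotone(colour, bound):
    """Check the monotonicity condition on the window [0, bound) x [0, bound)."""
    return all(colour(m2, n2) == "black"
               for m in range(bound) for n in range(bound)
               if colour(m, n) == "black"
               for m2 in range(m + 1) for n2 in range(n, bound))

# A made-up monotone colouring: black exactly when m <= 2n.
stripe = lambda m, n: "black" if m <= 2 * n else "white"
all_black = lambda m, n: "black"
diagonal = lambda m, n: "black" if m == n else "white"   # violates monotonicity
```

For `stripe` the frontier function is $n \mapsto 2n + 1$, which is nondecreasing as the text requires, and its frontier lies in a belt of slope $1/2$.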

We use $G$ to denote the following distinguished colouring:

$$G_{(p,q)}(m, n) = \begin{cases} \text{black}, & \text{if } p(m) \sqsubseteq q(n) \\ \text{white}, & \text{otherwise.} \end{cases}$$

The observation about $\mathcal{S}$ from above confirms that this is a valid colouring, i.e., that the
required monotonicity condition holds. We use $f_{(p,q)}$ to denote the frontier function of
$G_{(p,q)}$, and we understand the terms frontier function and frontier to be related to $G$
when not specified otherwise.

The following “Belt Theorem” gives a crucial fact about frontiers; by a belt we 
mean the set of points of the (first quadrant of the) plane lying between two parallel 
lines. 

Belt Theorem. Every frontier lies within a belt with nonnegative rational or infinite 
slope. 

This theorem is central for the decidability of simulation over one-counter nets. It was
proven previously by a combination of short and intuitive arguments; the theorem is also
present (though not so explicitly) in earlier work, but the proof outlined there is formidable.

Note that if, for a frontier function $f$, $f(n) = \omega$ for some $n$ then the respective frontier
is finite and lies within a horizontal belt (i.e., with slope 0). Otherwise $f$ (as a function
$\mathbb{N} \to \mathbb{N}$) is almost linear, though its steps $f(n+1) - f(n)$ need not be constant.
Nevertheless, we shall show that $f$ is periodic, i.e., from some $n_0$ a finite sequence of steps
is repeated forever; and moreover, its periodicity description (i.e., $n_0$, the sequence
of steps to be repeated, and the values of $f(n)$ for all $n < n_0$) can be effectively
computed, yielding the simple description of the set $\mathcal{S}$. (Note that the previously known
decision algorithms only approximate the set $\mathcal{S}$, or equivalently the colouring $G$, to a
sufficient level to answer the relevant question; effective constructability of the functions
$f_{(p,q)}$ does not follow from there.)

We now show how the frontier functions $f_{(p,q)}$ can be stepwise approximated. First
we say that a point $(m, n)$ (in $\mathbb{N} \times \mathbb{N}$) is locally correct in a colouring $C$ iff the following
holds for all $p, q \in Q$: if $C_{(p,q)}(m, n) = \text{black}$ and $p(m) \xrightarrow{a} p'(m')$ then there is
$q(n) \xrightarrow{a} q'(n')$ with $C_{(p',q')}(m', n') = \text{black}$. Note that the local correctness of a
point $(m, n)$ depends only on the restriction of $C$ to the neighbourhood of $(m, n)$,
i.e., to the set $\{(m', n') : |m' - m| \le 1, |n' - n| \le 1\}$; this follows from the fact that a
transition in a one-counter net can change the counter value by at most 1. We say that
$C$ is $k$-admissible, where $k \in \mathbb{N} \cup \{\omega\}$, iff each point $(m, n)$ with $m, n < k$ is locally
correct in $C$. In particular, note that $G$ is $\omega$-admissible.

The function $G^k : (Q \times Q) \to (\mathbb{N} \times \mathbb{N} \to \{\text{black}, \text{white}\})$ defined by

$G^k_{(p,q)}(m, n) = \text{black}$ iff $C_{(p,q)}(m, n) = \text{black}$ for some $k$-admissible colouring $C$

is easily seen to be a $k$-admissible colouring, and is in fact the maximal (i.e.,
maximally-black) $k$-admissible colouring; furthermore, the maximal $\omega$-admissible colouring $G^\omega$
is clearly $G$. For $k \in \mathbb{N}$, we denote the frontier function of $G^k_{(p,q)}$ by $f^k_{(p,q)}$. Note
that the range of $f^k_{(p,q)}$ is $\{0, 1, \ldots, k-1\} \cup \{\omega\}$ and that $f^k_{(p,q)}(n) = \omega$ for all $n \ge k$.

The description of each function $f^k_{(p,q)}$, i.e., (a table of) its values for $0, 1, \ldots, k-1$,
is effectively computable, for example, by an exhaustive search. As $G^k$ is $i$-admissible
for any $i \le k$, we have, for each $p, q$, $f^0_{(p,q)} \ge f^1_{(p,q)} \ge f^2_{(p,q)} \ge \ldots \ge f_{(p,q)}$ (where $f' \ge f''$
means $\forall n \in \mathbb{N} : f'(n) \ge f''(n)$). Therefore the function $g_{(p,q)} = \lim_{k \to \infty} f^k_{(p,q)}$ is
well-defined, and $g_{(p,q)} \ge f_{(p,q)}$. But since the colouring defined by these limit
functions $g_{(p,q)}$ (as the frontier functions) is $\omega$-admissible (recall the "locality" of the
local correctness condition), and $G$ is the maximal $\omega$-admissible colouring, we have
$g_{(p,q)} \le f_{(p,q)}$. Thus $g_{(p,q)} = f_{(p,q)}$, and therefore we get the following.

Lemma 1. For each $n \in \mathbb{N}$ there is $k > n$ such that each $f^k_{(p,q)}$ coincides with $f_{(p,q)}$
on the set $\{0, 1, 2, \ldots, n\}$.

Our algorithm will construct $G^k$ for $k = 0, 1, 2, \ldots$; Lemma 1 guarantees that
larger and larger initial portions of (the graphs of) $G_{(p,q)}$ are appearing during the run
of the algorithm (though we do not know the extent of the portion of $G$ in $G^k$). To show
when our algorithm can terminate, recognizing an initial portion of $G$ and providing
a description of the whole $G$, we now explore a certain "repeatable pattern" which is
guaranteed to appear in $G$.

By the Belt Theorem, we can fix a set of belts with nonnegative rational or infinite 
slopes such that each frontier is contained in one of them. We assume that the belts are 
“sufficiently” thick; thus we can, for instance, suppose that the belt slopes are pairwise 
distinct (merging parallel belts into one thicker). 

Now we can choose $h_1, h_2, i \in \mathbb{N}$, where $0 < h_1 < h_2 < i$, such that (see Fig. 2):

1. for each frontier function $f$ with $f(h_2) < \omega$, all frontier points $(f(n), n)$ between
levels $h_1$ and $h_2$ (i.e., with $h_1 \le n \le h_2$) lie in one of the belts (this follows trivially
from our assumption; note that Fig. 2 depicts just one frontier in each belt, though
in general there can be several frontiers in a single belt);

2. the belts are pairwise disjoint at and above level $h_1 - 1$ (i.e., we choose $h_1$ large
enough so that at level $h_1 - 1$ each belt is to the right of any other belt with greater
slope);

3. for each frontier function $f$: if $f(h_1 - 1) < 1$ then $f(h_2) = f(h_1 - 1)$; and if $f(h_2) = \omega$
then $f(h_1 - 1) = \omega$ (this is satisfied when $h_1$ and $h_2$ are chosen large enough);

4. for each frontier function $f$ and each $n \le h_2$: if $f(n) < \omega$ then $f(n) < i$ (this is
satisfied by choosing $i$ large enough after the choice of $h_1$ and $h_2$).

Fig. 2. Graphs of $G_{(p,q)}$ displaying a repeatable pattern, superimposed onto each other

Each frontier point (f (n), n) has a certain (horizontal) distance to the left border line 
of the belt in which it lies. Since the slope of each belt is rational, it is clear that such 
distances range over finitely many possible values. So, by a straightforward use of the 
pigeonhole principle, we can additionally suppose (i.e., we could choose Hi , Hi, i so) 
that the frontier points of all frontiers inside a single belt have the same relative positions 
at levels Hi and Hi— 1 as at levels Hi and Hi —1 , respectively. More precisely: 

5. for each frontier function f with f(H 2 )<tu, the slope of the belt in which the re- 
spective frontier appears between levels Hi and Hi is (Hi— Hi (/(((Hi)— f(Hi )); 
moreover, f (Hi)— f (Hi— 1 ) = f(Hi )— f(Hi —1 ) 

The number of possible distances would allow us to calculate a bound b such that we
can even suppose (i.e., choose so) that h_2 − h_1 < b. Note that b does not depend on
how thick the belts are chosen. In particular, we can assume each belt to be so thick
that for each frontier point (f(n), n) in the belt, with n > h_1, the point (f(n), n + b) is
still an interior point of the belt, i.e., its whole neighbourhood lies in the belt. Informally
we say that the belt has a sufficiently thick monochromatic left subbelt (above h_1);
monochromatic means that each G_(p,q) is constant (either black or white) on the subbelt.
Therefore we could choose belts and h_1, h_2 and i so that the following additional
condition is satisfied:
6. for each frontier point (f(n), n) with h_1 < n < h_2, the point (f(n), n + (h_2 − h_1)) is
an interior point of the belt in which the respective frontier lies between levels h_1
and h_2.

We now say that a colouring C has a repeatable pattern, based on h_1, h_2 and i, iff
there are belts such that the above conditions 1.-6. are satisfied (where the terms frontier
and frontier function are understood as those related to C). We have thus demonstrated
that G has a repeatable pattern. Our algorithm, which constructs successive colourings,
terminates when it finds some G' which has a repeatable pattern based on some h_1, h_2
and i with i < j; such a condition is clearly decidable, and Lemma 1, together with the
fact that G has a repeatable pattern, guarantees termination of the algorithm. Having
discovered a repeatable pattern for G' based on h_1, h_2 and i with i < j, we define the
colouring G* by defining its frontier functions inductively as follows:



\[
f^{*}_{(p,q)}(n) =
\begin{cases}
f'_{(p,q)}(n) & \text{if } n \le h_2 \\
f^{*}_{(p,q)}(n-c) + d & \text{if } n > h_2
\end{cases}
\]

where c = h_2 − h_1 and d = f'_(p,q)(h_2) − f'_(p,q)(h_1). Hence each f*_(p,q) is periodic,
G* arising from G' by repeating the sequence of steps between h_1 and h_2 forever. Also
note that if f'_(p,q)(n) = ω for some n ≤ h_2 then [...]. We now show (Lemma 3)
that G* is in fact G. To this end, we make some considerations and introduce some
auxiliary notions.
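The inductive definition of the frontier functions of G* can be sketched in a few lines. This is only an illustration under stated assumptions: the finite prefix of a frontier function is given as a Python list indexed by level, ω is modelled by `float("inf")`, the shift d is taken to be the difference of the prefix values at h_2 and h_1, and the names `OMEGA` and `make_f_star` are hypothetical.

```python
OMEGA = float("inf")  # stands in for the value ω (assumption of this sketch)

def make_f_star(prefix, h1, h2):
    """Extend a finite prefix f'(0..h2) periodically above level h2.

    prefix[n] is the frontier value at level n for 0 <= n <= h2.
    Above h2 the steps made between h1 and h2 are repeated forever.
    """
    c = h2 - h1                      # vertical period, must be positive
    # Horizontal shift per period; if the frontier is already ω, keep it ω.
    d = 0 if prefix[h2] == OMEGA else prefix[h2] - prefix[h1]

    def f_star(n):
        if n <= h2:
            return prefix[n]
        # Number of whole periods needed to land in the range (h1, h2].
        k = -(-(n - h2) // c)        # ceil((n - h2) / c)
        base = prefix[n - k * c]
        return base if base == OMEGA else base + k * d

    return f_star
```

For instance, with the prefix `[0, 1, 1, 2, 3]`, h_1 = 2 and h_2 = 4, the extension climbs by d = 2 every c = 2 levels.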

First recall that the local correctness of a point (m, n) in a colouring C depends
only on the restriction of C to the neighbourhood of (m, n). Also recall that the possible
transitions from a state p(m) do not depend on m when m > 0. Therefore G* is
surely ω-admissible: each point (m, n) in the verified area, i.e., with m < j and n < h_2,
is locally correct since it is (by definition) locally correct in G', and G' and G* coincide
on the neighbourhood of (m, n). Furthermore, each point outside the verified
area obviously has a corresponding point in the verified area whose neighbourhood is
coloured identically. By the fact that G is the maximal ω-admissible colouring, we
have f*_(p,q) ≤ f_(p,q). Since f_(p,q) ≤ f'_(p,q), we have f*_(p,q)(n) = f_(p,q)(n) for all n ≤ h_2
(where f*_(p,q) coincides with f'_(p,q)). The only possibility that G* and G are not equal
is if f*_(p,q)(n) < f_(p,q)(n) for some n > h_2. Due to the next result (Lemma 2), this will
lead to a contradiction in the proof of Lemma 3.

Let v = (v_1, v_2) ∈ Z×Z be a vector with integer entries. A point (m, n) ∈
N×N with m + v_1, n + v_2 ≥ 0 is lit by v in G_(p,q) iff G_(p,q)(m, n) = black and
G_(p,q)(m + v_1, n + v_2) = white; if (m, n) is lit by v in some G_(p,q), then we say that
(m, n) is lit by v. For points (m, n), (m', n') ∈ N×N we write (m, n) ↔_v (m', n')
iff both are lit by v, and |m − m'| ≤ 1 and |n − n'| ≤ 1. The transitive closure of ↔_v
is denoted by ↔_v*. Note that (m, n) ↔_v* (m', n') can be demonstrated by giving a
trajectory, a sequence of points (m_0, n_0), (m_1, n_1), ..., (m_k, n_k) such that

(m, n) = (m_0, n_0) ↔_v (m_1, n_1) ↔_v ··· ↔_v (m_k, n_k) = (m', n').
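The notions of a lit point and of a trajectory can be made concrete on a finite fragment of a single colouring. The following is a minimal sketch, assuming the fragment is given as a dictionary from points to "black" or "white"; the helper names `is_lit` and `reachable_lit` are hypothetical, and points outside the fragment are simply treated as not lit.

```python
from collections import deque

def is_lit(colouring, point, v):
    """True iff point is black and its translate by v is white (both in range)."""
    m, n = point
    shifted = (m + v[0], n + v[1])
    if shifted[0] < 0 or shifted[1] < 0:
        return False
    return colouring.get(point) == "black" and colouring.get(shifted) == "white"

def reachable_lit(colouring, start, v):
    """All lit points connected to start by a trajectory of neighbouring lit points."""
    if not is_lit(colouring, start, v):
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        m, n = queue.popleft()
        # Neighbours differ by at most 1 in each coordinate.
        for dm in (-1, 0, 1):
            for dn in (-1, 0, 1):
                nxt = (m + dm, n + dn)
                if nxt not in seen and is_lit(colouring, nxt, v):
                    seen.add(nxt)
                    queue.append(nxt)
    return seen
```

On a 6×6 fragment that is black exactly where m ≤ n, the vector v = (−1, −2) lights the diagonal points from (2, 2) upwards, and they form a single trajectory.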
Lemma 2. Let h > 0 and v = (v_1, v_2) with v_1 < 0 and v_2 < 0. If a point (m_0, n_0) with
n_0 + v_2 > h is lit by v then there is a point (m_0', n_0') with n_0' + v_2 = h, such that
(m_0, n_0) ↔_v* (m_0', n_0').

Proof. Suppose (m_0, n_0) satisfies the assumption but there is no required (m_0', n_0');
then n' + v_2 > h for each (m', n') such that (m_0, n_0) ↔_v* (m', n'). Define the colouring
Ĝ by

Ĝ_(p,q)(m, n) = black iff G_(p,q)(m, n) = black, or
(m − v_1, n − v_2) is lit by v in G_(p,q) and
(m_0, n_0) ↔_v* (m − v_1, n − v_2).

Ĝ obviously satisfies the monotonicity property of colourings, and we can easily check
that each point is locally correct in Ĝ. Hence Ĝ is ω-admissible, which contradicts the
fact that G is the maximal ω-admissible colouring. □

Lemma 3. G* is equal to G.

Proof. We have already shown that each f*_(p,q) coincides with f_(p,q) on the set
{0, 1, 2, ..., h_2}, so we only have to exclude the possibility that f*_(p,q)(n) < f_(p,q)(n)
for some n > h_2.

Recall that our algorithm stops by finding a repeatable pattern, for h_1, h_2, i, in G'
(i < j). Let us fix a corresponding set of belts required by the definition of a repeatable
pattern (note that each frontier of G* lies in one of the belts above h_1).

We say that a belt B is valid iff G* coincides with G when restricted to B. (In
particular, the horizontal belt, if it was chosen, is surely valid.) If all belts are valid,
then surely G* is equal to G. Otherwise, let B be the rightmost belt (i.e., the belt
with the least slope) which is not valid. Consider an invalid point (m_0, n_0) in B, i.e.,
G*_(p,q)(m_0, n_0) = white and G_(p,q)(m_0, n_0) = black, for some p, q; moreover we
suppose n_0 to be minimal (i.e., B is valid below n_0). Note that n_0 > h_2.

Let α be the slope of B, and let v = (v_1, v_2), where v_1 = (h_1 − h_2)/α and v_2 =
h_1 − h_2 (v corresponds to the "period of B" in G*; see Fig. 3). Due to the choice of v (as
the period of B) we have G*_(p,q)(m_0 + v_1, n_0 + v_2) = white, and since B is valid below
n_0, we have G_(p,q)(m_0 + v_1, n_0 + v_2) = white. This means that the point (m_0, n_0) is
lit by v in G_(p,q). Due to Lemma 2 (for h_1 in the place of h) there is a point (m_0', n_0')
(lit by v) such that (m_0, n_0) ↔_v* (m_0', n_0') and n_0' + v_2 = h_1, i.e., n_0' = h_2. Recall that
the restrictions of G* and G to N × {0, 1, 2, ..., h_2} coincide. Hence if there is no
belt to the right of B then there is clearly no point (m', h_2) which would be lit by
v. Otherwise let B' be the first belt to the right of B. Any point (m', h_2) which is lit
by v can lie only in, or to the right of, B'. Nevertheless any trajectory demonstrating
(m_0, n_0) ↔_v* (m', h_2) would have to cross the (sufficiently thick) monochromatic left
subbelt of (the valid) B', which is impossible. (The first point on such a trajectory which
is in B', and is thus not an interior point of B', cannot be lit by v.) □

We can summarize the preceding argument in the following. 

Theorem 1. There is an algorithm which, given a one-counter net, constructs a description
of the respective maximal simulation relation; more concretely, it gives periodicity
descriptions for the corresponding frontier functions.






Petr Jancar, Antonin Kucera, and Faron Moller 




Fig. 3. The assumption G ≠ G* leads to a contradiction.



3 Applications 

In this section we show how Theorem 1 can be applied to obtain new decidability results
for one-counter nets. The following one comes almost for free.

Theorem 2. The problem of strong regularity of one-counter nets is decidable. 

Proof. Let p(i) be a process of the one-counter net N = (Q, Σ, δ^=, δ^>). Define the set
M = {q ∈ Q | p(i) →* q(j) for infinitely many j ∈ N}. Observe that M is effectively
constructible using standard techniques for pushdown automata. As Q is finite, we see
that p(i) can reach infinitely many pairwise non-equivalent states iff there is q ∈ M
such that for every i ∈ N there is some j > i such that q(j) ⋠ q(i). In other words,
p(i) is not strongly regular w.r.t. simulation equivalence iff there is q ∈ M such that
the frontier function f_(q,q) has no ω-values (∀n ∈ N : f_(q,q)(n) < ω). □

Next we show that a number of simulation problems for processes of one-counter nets 
can be reduced to the corresponding bisimulation problems for processes of one-counter 
automata. In this way we obtain further (original) decidability results. The basic tool 
which enables the mentioned reductions is taken from lllTI and is described next. 

For every image-finite transition system T = (S, Act, →) we define the transition
system B(T) = (S, Act, ↦) where ↦ is given by

s ↦ᵃ t iff s →ᵃ t and ∀u ∈ S : (s →ᵃ u ∧ t ≼ u) ⟹ u ≼ t

Note that B(T) is obtained from T by deleting certain transitions (preserving only the
"maximal" ones). Also note that T and B(T) have the same set of states; as we often
need to distinguish between processes "s of T" and "s of B(T)", we denote the latter
one by s_B. A proof of the next (crucial) theorem, relating simulation equivalence and
bisimulation equivalence, can be found in mni.
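For a finite transition system the construction of B(T) can be carried out directly. The sketch below is illustrative only: it assumes transitions are given as a set of triples (s, a, t), computes the simulation preorder by a naive greatest-fixed-point iteration, and then keeps the transitions whose target is not strictly dominated by another target of the same action. The function names are hypothetical.

```python
def simulation_preorder(states, trans):
    """Return the set of pairs (s, t) such that s is simulated by t."""
    rel = {(s, t) for s in states for t in states}
    changed = True
    while changed:
        changed = False
        for (s, t) in list(rel):
            for (s1, a, s2) in trans:
                if s1 != s:
                    continue
                # t must answer the move s -a-> s2 with some t -a-> t2, s2 below t2.
                if not any(t1 == t and b == a and (s2, t2) in rel
                           for (t1, b, t2) in trans):
                    rel.discard((s, t))
                    changed = True
                    break
    return rel

def maximal_transitions(states, trans):
    """Keep s -a-> t unless some s -a-> u has t simulated by u but not u by t."""
    sim = simulation_preorder(states, trans)
    keep = set()
    for (s, a, t) in trans:
        dominated = any(s1 == s and b == a and (t, u) in sim and (u, t) not in sim
                        for (s1, b, u) in trans)
        if not dominated:
            keep.add((s, a, t))
    return keep
```

In a system with transitions s →ᵃ t, s →ᵃ u and u →ᵇ x, the deadlocked state t is strictly below u, so only s →ᵃ u and u →ᵇ x survive.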

Theorem 3. Let s and t be processes of image-finite transition systems T and T',
respectively. It holds that s is simulation equivalent to s_B and t is simulation equivalent
to t_B; moreover, s and t are simulation equivalent iff s_B ∼ t_B.

The next theorem provides the technical basis for the aforementioned reductions. 

Theorem 4. Let N be a one-counter net. Then the transition system B(T_N) is effectively
definable within the syntax of one-counter automata, i.e., one can effectively construct
a one-counter automaton M such that T_M is isomorphic to B(T_N). Moreover, for every
state s = p(i) of T_N we can effectively construct a state p'(i') of T_M which is
isomorphic to the state s_B of B(T_N).

Proof. Let N = (Q, Σ, δ^=, δ^>) be a one-counter net, and let ↦ be the transition relation
of B(T_N). Let us define the function MaxTran : Q × Σ × N → P(Q × {−1, 0, 1})
as follows:

(q, j) ∈ MaxTran(p, a, i) iff p(i) ↦ᵃ q(i + j)

In fact, MaxTran(p, a, i) represents all "maximal" a-transitions of p(i). Our aim is to
show that the function MaxTran is, in some sense, periodic: we prove that there
(effectively) exists n > 0 such that for all p ∈ Q, a ∈ Σ, and i ≥ n we have that
MaxTran(p, a, i) = MaxTran(p, a, i + n). It clearly suffices for our purposes because
then we can construct a one-counter automaton
M = (Q × {0, ..., n − 1}, Σ, γ^=, γ^>) where γ^= and γ^> are the least sets satisfying
the following conditions:

- if p(i) ↦ᵃ q(j) where 0 ≤ i, j < n, then ((q, j), 0) ∈ γ^=((p, i), a)

- if p(n − 1) ↦ᵃ q(n), then ((q, 0), +1) ∈ γ^=((p, n − 1), a)

- if p(n + i) ↦ᵃ q(n + j) where 0 ≤ i, j < n, then ((q, j), 0) ∈ γ^>((p, i), a)

- if p(n) ↦ᵃ q(n − 1), then ((q, n − 1), −1) ∈ γ^>((p, 0), a)

- if p(2n − 1) ↦ᵃ q(2n), then ((q, 0), +1) ∈ γ^>((p, n − 1), a)

Note that the definition of M is effective, because the constant n can be effectively
found and for every transition p(i) →ᵃ q(j) of T_N we can effectively decide whether
p(i) ↦ᵃ q(j) (here we need the decidability of simulation for one-counter nets). The
fact that T_M is isomorphic to B(T_N) is easy to see as soon as we realize that B(T_N)
can be viewed as a sequence of "blocks" of height n, where all "blocks" except for
the initial one are the same. The structure of the two (types of) blocks is encoded in
the finite control of M, and the number of the "current" block is stored in its counter (see
Fig. 4). Note that M indeed needs the test for zero in order to recognize that the initial
block has been entered.

Now we show how to construct the constant n. First, we prove that for all p ∈ Q,
a ∈ Σ one can effectively find two constants k(p, a) and l(p, a) such that for every
i > k(p, a) we have MaxTran(p, a, i) = MaxTran(p, a, i + l(p, a)). We start by
reminding ourselves that the out-going transitions of p(i) and p(j), where i, j ≥ 1, are
the "same" in the following sense (see Fig. 4):

p(i) →ᵃ q(i + m) iff p(j) →ᵃ q(j + m) iff (q, m) ∈ δ^>(p, a).

Fig. 4. The structure of T_N (left) and B(T_N) (right)

Hence, the set MaxTran(p, a, i), where i ≥ 1, is obtained by selecting certain elements
from δ^>(p, a). In order to find these elements, we must (by the definition of B(T))
take all pairs ((q, m), (r, n)) ∈ δ^>(p, a) × δ^>(p, a), determine whether q(i + m) ≼
r(i + n), and select only the "maximals". For each such pair ((q, m), (r, n)) we define
an infinite binary sequence S as follows: S(i) = 1 if G_(q,r)(i + m, i + n) = black,
and S(i) = 0 otherwise. As (a description of) G_(q,r) can be effectively constructed,
and the frontier function f_(q,r) is periodic (see Theorem 1), we can conclude that
S = αβ^ω where α, β are finite binary strings. Note that α and β can be "read" from
the constructed description of G_(q,r) and thus they are effectively constructible. As
δ^>(p, a) is finite, there are only finitely many pairs to consider and hence we obtain
only finitely many α's and β's. Now we let k(p, a) be the length of the longest α, and
let l(p, a) be the product of lengths of all β's. In this way we achieve that the whole
information which determines the selection of "maximal" elements of δ^>(p, a) during
the construction of MaxTran(p, a, i) is periodic (w.r.t. i) with period l(p, a) after a
finite "initial segment" of length k(p, a). Let K = max{k(p, a) | p ∈ Q, a ∈ Σ}, and
L = ∏_{p∈Q, a∈Σ} l(p, a). Finally, let n = K · L.
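The choice of the threshold and period can be checked on small examples. The sketch below assumes each sequence S is given by its finite strings α and β (as Python strings over "0" and "1", with β non-empty); `eventually_periodic` and `threshold_and_period` are hypothetical names for this illustration.

```python
from math import prod

def eventually_periodic(alpha, beta):
    """Return S with S(i) = i-th symbol of alpha followed by beta repeated forever."""
    def S(i):
        if i < len(alpha):
            return alpha[i]
        return beta[(i - len(alpha)) % len(beta)]
    return S

def threshold_and_period(pairs):
    """Given (alpha, beta) pairs, return (k, l) as described in the text:
    k is the length of the longest alpha, l the product of the beta lengths."""
    k = max(len(alpha) for alpha, _ in pairs)
    l = prod(len(beta) for _, beta in pairs)
    return k, l
```

Since l is a multiple of every individual period and k is at least every individual prefix length, all the sequences repeat with period l from position k onwards.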

To finish the proof, we need to show that for every state s = p(i) of T_N one can
construct a state p'(i') of T_M which is isomorphic to the state s_B of B(T_N). This is
straightforward; we simply take p' = (p, i mod n) and i' = i div n. □
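The correspondence between states of the net and states of the constructed automaton is plain integer division. A minimal sketch, with hypothetical helper names `encode` and `decode`:

```python
def encode(p, i, n):
    """Map the state p(i) to the pair (control state, counter value) of M."""
    return (p, i % n), i // n

def decode(control, counter, n):
    """Inverse of encode: recover p and i from ((p, r), counter)."""
    p, r = control
    return p, counter * n + r
```

The counter of M thus stores the index of the current block, and the finite control stores the position inside the block.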

Two concrete examples of how Theorems 3 and 4 can be applied to obtain (new and
nontrivial) positive decidability results on one-counter nets are given next.

Corollary 1. The problem of regularity of one-counter nets is decidable. 
Proof. It suffices to realize that a process s of a transition system T is regular w.r.t.
simulation equivalence iff the process s_B of B(T) is ∼-regular. As ∼-regularity is
decidable for processes of one-counter automata |0|, we are done. □

Corollary 2. Let pα be a process of a deterministic pushdown automaton F and q(i)
be a process of a one-counter net N. The problem whether pα and q(i) are simulation
equivalent is decidable.

Proof. First, realize that if T is a deterministic transition system then B(T) = T. Hence,
pα is simulation equivalent to q(i) iff pα ∼ q'(i'), where q'(i') is the process of
Theorem 4. As one-counter automata are (special) pushdown automata, we can apply the
result of Cl which says that bisimilarity is decidable for pushdown processes. □

The previous corollary touches, in a sense, the decidability/undecidability border for
simulation equivalence, because the problem whether pα is simulation equivalent to q(i),
where pα is a process of a deterministic PDA F and q(i) is a process of a one-counter
automaton M, is undecidable Q (in fact, it is undecidable even if we require F to be a
deterministic one-counter automaton).
Abstract. We present a global and comprehensive view of the properties
of subclasses of two counters automata for which counters are only
accessed through the following operations: increment (+1), decrement
(−1), reset (c := 0), transfer (the whole content of counter c is transferred
into counter d), and testing for zero. We first extend Hopcroft-Pansiot's
result (an algorithm for computing a finite description of the semilinear
set post*) to two counters automata with only one test for zero (and
one reset and one transfer operation). Then, we prove the semilinearity
and the computability of pre* for the subclass of 2 counters automata
with one test for zero on c_1, two reset operations and one transfer from
c_1 to c_2. By proving simulations between subclasses, we show that this
subclass is the maximal class for which pre* is semilinear and effectively
computable. All the (effective) semilinearity results are obtained with
the help of a new symbolic reachability tree algorithm for counter automata
using an Acceleration function. When Acceleration has the
so-called stability property, the constructed tree computes exactly the
reachability set.



1 Introduction 

Context. The highly successful model-checking approach for finite systems is
mainly a consequence of the fact that finite state automata enjoy many good
properties, such as strong relations with logics (first order and monadic second order,
linear temporal logic (LTL)), decidability and complexity results, efficient algorithmics,
closure properties, etc. Thus, finite automata are a nice framework for the
model-checking of programs which operate on bounded and finite variables. But the
domains of variables are sometimes impossible to compute, and
this is all the more true when the domain is infinite.

Before we survey the decidability results about extended automata, we give
a precise view of the properties we are interested in. One of the most important
classes of properties is that of safety properties. Model-checking safety properties
often reduces to the effective computation of the set of predecessors (pre*) and/or
the set of successors (post*). We thus want to find an infinite-state model for
which the pre* and/or post* images always belong to a "good" class. In our
framework, these two sets are two (infinite) subsets of Q × N^n (where Q is
the finite set of control states and n is the number of counters on which the
automaton operates). Semilinear sets of Q × N^n form a "good" class since (1)
they are closed under ∩ and under projection, (2) inclusion is decidable, and
(3) they are also the sets expressible in Presburger arithmetic over integers and
also the set of regular subsets in the commutative monoid Q × N^n. Hence, our
objective is to find an extension of finite automata for which pre* and/or post*
are semilinear and, if possible, effectively computable.



Related work. Automata with counters, but without any zero-testing primitive, 
are known as Vector Addition Systems with States (VASS) and they are equiva- 
lent to Petri nets. The reachability problem is thus decidable but the comparison 
between two reachability sets is undecidable. Moreover, in dimension n, n > 2, 
the reachability set is not always semilinear EEZi. 

Stack automata are another well-known extension of finite automata. They 
have regular reachability sets (in Q x A*); this property has been recently redis- 
covered ffiS8ti| K ;a,uf)2f IF'W Wt)7l Itit^lVIt)?^ and used for model-checking of stack 
automata. Of course, stack automata contain 1-counter automata (with the ca- 
pability of zero-testing). At last, it has been proved that lossy FIFO automata 
have regular non-computable reachability sets pc ;iPt)tij while half-duplex and 
quasi-stable fifo automata have effectively computable regular reachability sets 
PEHlI. For flat counters automata (flat means that there is no nested loop in 
the control transition graph), the reachability relation can be expressed in the 
Presburger logics unnH). 

A result of Hopcroft and Pansiot [HP79] states that the reachability set of
a 2-dim Vector Addition System with States (2-dim VASS) is semilinear and
that it is effectively computable. Hopcroft and Pansiot gave an algorithm which
computes a description of the semilinear reachability set. The complexity of 
Hopcroft-Pansiot’s algorithm has been studied later by Rosier and Yen [OTRHYBBj 
who showed that the deterministic complexity is in 2^ where n is the size of 
the automaton with its initial state. 

More recently, some papers appeared dealing with verification of reset/transfer
Petri nets (or counters automata for which counters are only accessed through
the following operations: increment (+1), decrement (−1), reset (c := 0), transfer
(the whole content of counter c is transferred into counter c')). For these weak
counter automata (that do not have a full-fledged test for zero), reachability 
of a control state is decidable for reset/transfer Petri nets, thus also for coun- 
ters automata with the following operations {-1-1, -1, reset, transfer} |l IFShSj . 
Reachability and boundedness are undecidable in dimension 3 (since there are 
three counters) |l ll'ShS] ; the limit between decidability /undecidability for the 
boundedness problem is precised as follows: boundedness is decidable for {-1-1, 
-1, reset {-counters automata for which only two counters can be reset EMj. 
Reachability is decidable for counters automata such that only a unique counter 
can be tested for zero !b^ . These extensions except pReiflhj do not contain 
the complete test for zero. 

These results from the literature do not give a global and comprehensive
view of the properties of classes of automata with two {+1, −1, reset, transfer,
zero-test}-counters. Can we extend the result of Hopcroft and Pansiot to more
powerful models of two counters automata? Are pre* and/or post* also semilinear
for extended automata?



Our contribution. We present three types of results: 

(1) A new general symbolic reachability tree (semi-) algorithm for automata
with {+1, −1, reset, transfer, zero-test}-counters.

This algorithm is generic and modular, and it uses an Acceleration function.
It exactly computes the reachability set (post*) when the Acceleration function
satisfies a so-called stability property. Our algorithm, given for any n-dim
Extended VASS, both generalizes Hopcroft-Pansiot's algorithm (given for 2-dim
VASS) and refines Karp-Miller's algorithm (given for n-dim VASS) [KM69]. It
gives an exact description of post* (as Hopcroft-Pansiot's algorithm does, and not
only a coverability set like Karp-Miller's algorithm). We prove termination and
correctness of this algorithm for some subclasses of automata with two {+1, −1,
reset, transfer, zero-test}-counters.

(2) A hierarchy result. 

We systematically study all the subclasses of two {+1, −1, reset, transfer,
zero-test}-counters automata and we give a complete hierarchy between classes
using new simulations. A "maximal decidable model" appears to be 2-counters
automata (T_1R_{1,2}Tr_{12}) with the following extended operations: test for zero on
the first counter, reset on the two counters, and transfer from the first to the
second counter (see Figure C] on Dage l35ti|i . 

(3) Three main technical results. 

We prove the three following theorems: pre* is an effective semilinear set 
for the class TiRi^ 2 Tfi 2 (Theorem ; post* is an effective semilinear set for 
the class TiRiTti 2 (Theorem 0) ; and, post* is a semilinear set for the class 
TiRi, 2 Tti 2 (Theorem|3). 

Plan. Section 2 introduces Extended Vector Addition Systems with States
(E-VASS), and we present in Section 3 semilinear sets and projections. In Section 4
we give a new symbolic reachability tree algorithm for E-VASS which uses
as a parameter a function Acceleration and we define the stability property.
Section 5 extends Hopcroft-Pansiot's result (effective semilinearity of post*) to
2-dim Extended VASS with one test for zero (and with one reset and one transfer
operation). Section 6 proves the effective semilinearity of pre* for the maximal
class T_1R_{1,2}Tr_{12}. Section 7 shows that post* is still semilinear for the maximal
class T_1R_{1,2}Tr_{12}.

The proofs are technically nontrivial and are omitted here due to space constraints.
They can be found in the full version available from the authors and at
the URL http://www.lsv.ens-cachan.fr/Publis/.
2 Extended Vector Addition Systems with States

Let Z (resp. N, N*, Q+) denote the set of integers (resp. nonnegative integers,
positive integers, nonnegative rational numbers). If i, j ∈ N, we write [i, j] for
the set {k ∈ N / i ≤ k ≤ j}. Let N^n (resp. Z^n) denote the set of n-tuples
of elements of N (resp. Z). If x is an n-tuple and i ∈ [1, n], x(i) is the i-th
component of x. The i-th unit vector is the n-tuple e_i defined by e_i(j) = 0 if
j ≠ i and e_i(i) = 1. Operations on n-tuples are componentwise extensions of the
usual operations and when 0 is used as an n-tuple, it denotes the all zero n-tuple.
These operations are classically extended on sets of n-tuples (e.g. for P, P' ⊆ N^n,
P + P' = {p + p' / p ∈ P and p' ∈ P'}). Moreover, in an operation involving
sets of n-tuples, we shortly write v for the singleton {v} (e.g. for P ⊆ N^n and
x ∈ N^n, we write x + P for {x} + P).

For every set X, we write P(X) for the set of subsets of X and |X| for the
cardinal of X. For any infinite sequence (x_i)_{i∈N} in X, an infinite subsequence (of
(x_i)) is any infinite sequence (x_{i_k})_{k∈N} with i_0 < i_1 < i_2 ··· < i_k < i_{k+1} ···.

An ordering is any reflexive, transitive and antisymmetric relation ⪯ over
some set X. A well ordering is any ordering ⪯ such that, for any infinite sequence
(x_i)_{i∈N} in X, there exists an infinite increasing subsequence x_{i_0} ⪯ x_{i_1} ⪯
x_{i_2} ··· ⪯ x_{i_k} ⪯ ···. If ⪯ is an ordering on X, then for every subset Y ⊆ X,
the set Min(Y) of minimal elements of Y is defined by Min(Y) = {y ∈ Y / ∀y' ∈
Y \ {y}, y' ⋠ y} and it is finite when ⪯ is a well ordering.

The relation ≤ between n-tuples in Z^n is defined by x ≤ y if for all i ∈ [1, n]
we have x(i) ≤ y(i). This relation is an ordering on Z^n and it is a well ordering
on N^n (it is not a well ordering on Z^n).
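For a finite set of tuples, the componentwise ordering and the set Min(Y) can be computed by direct comparison. The following is a small sketch; the function names `leq` and `minimal_elements` are hypothetical.

```python
def leq(x, y):
    """Componentwise ordering on tuples of equal length."""
    return all(a <= b for a, b in zip(x, y))

def minimal_elements(Y):
    """Min(Y): elements of Y with no other element of Y below them."""
    Y = set(Y)
    return {y for y in Y if not any(z != y and leq(z, y) for z in Y)}
```

For example, among (1, 2), (2, 1), (2, 2) and (3, 0), only (2, 2) is not minimal, because (1, 2) lies below it.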

A labelled transition system is a structure TS = (S, A, →) where S is a set of
states, A is a finite set of actions and → ⊆ S × A × S is a set of transitions. When
S is finite, TS is a finite labelled transition system. We write →* for the reflexive and
transitive closure of →. For every subset X of S, we write post(X) for the set
{s ∈ S / ∃r ∈ X, r → s} of immediate successors of X, post*(X) for the
set {s ∈ S / ∃r ∈ X, r →* s} of successors of X, and pre*(X) for the set
{s ∈ S / ∃r ∈ X, s →* r} of predecessors of X.
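On a finite labelled transition system these three sets can be computed by a simple saturation. The sketch below assumes transitions are given as a set of triples (s, a, t); the function names mirror the notation of the text but are otherwise hypothetical.

```python
def post(trans, X):
    """Immediate successors of the states in X."""
    return {t for (s, _a, t) in trans if s in X}

def post_star(trans, X):
    """All states reachable from X in zero or more steps."""
    seen = set(X)
    frontier = set(X)
    while frontier:
        frontier = post(trans, frontier) - seen
        seen |= frontier
    return seen

def pre_star(trans, X):
    """All states from which some state of X is reachable."""
    reversed_trans = {(t, a, s) for (s, a, t) in trans}
    return post_star(reversed_trans, X)
```

For infinite-state systems such as those studied here this saturation need not terminate, which is why symbolic descriptions are needed.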

We present Extended Vector Addition Systems with States (E-VASS) in two 
steps : we first describe the finite control structure of an E-VASS and we then 
define an E-VASS as an operational semantics associated with such a control 
structure, which leads to an infinite labelled transition system. 

An n-dim E-VASS Control is any finite labelled transition system 𝒜 =
(Q, Op, →_𝒜) such that Op ⊆ {add(v) / v ∈ Z^n} ∪ {test(i), weaktest(i), reset(i),
transfer(i→j) / i, j ∈ [1, n], i ≠ j}.

Definition 1. An n-dim E-VASS A is a labelled transition system A = (S, Op,
→_A) based on an n-dim E-VASS Control 𝒜 = (Q, Op, →_𝒜), where:

— S = Q × N^n is the set of states, and,

— Op is the set of actions, and,

— the set →_A of transitions is the smallest subset of S × Op × S verifying:

1. if $q \xrightarrow{add(v)}_{\mathcal{A}} q'$ then for all x ∈ N^n such that x + v ≥ 0,
$(q, x) \xrightarrow{add(v)}_{A} (q', x + v)$, and

2. if $q \xrightarrow{test(i)}_{\mathcal{A}} q'$ then for all x ∈ N^n such that x(i) = 0,
$(q, x) \xrightarrow{test(i)}_{A} (q', x)$, and

3. if $q \xrightarrow{weaktest(i)}_{\mathcal{A}} q'$ then for all x ∈ N^n such that x(i) = 0
and for all a ∈ N, $(q, x) \xrightarrow{weaktest(i)}_{A} (q', x + a e_i)$, and

4. if $q \xrightarrow{reset(i)}_{\mathcal{A}} q'$ then for all x ∈ N^n,
$(q, x) \xrightarrow{reset(i)}_{A} (q', x')$ where x'(i) = 0
and for all j ≠ i, x'(j) = x(j), and

5. if $q \xrightarrow{transfer(i \to j)}_{\mathcal{A}} q'$ then for all x ∈ N^n,
$(q, x) \xrightarrow{transfer(i \to j)}_{A} (q', x')$
where x'(i) = 0, x'(j) = x(i) + x(j) and for all k ∉ {i, j}, x'(k) = x(k).



A weaktest(i) transition is the inverse of a reset(i) in the sense that we
have $(q, x) \xrightarrow{weaktest(i)} (q', x')$ iff $(q', x') \xrightarrow{reset(i)} (q, x)$.
We will use weaktest(i) transitions in Section 6 to compute pre* for E-VASS with
reset(i) transitions. Notice that any ε-transition may be simulated by an add(0)
transition. An n-dim (non extended) VASS is any n-dim E-VASS A = (S, Op, →_A)
such that Op ⊆ {add(v) / v ∈ Z^n}.
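The effect of the individual operations on a counter valuation can be sketched as a small step function. This is an illustration only: counters are indexed from 0 rather than 1, operations are encoded as hypothetical tuples such as `("add", v)` or `("transfer", i, j)`, control states are left out, and weaktest is omitted because it has infinitely many successors.

```python
def step(x, op):
    """Return the list of successor valuations of x under op (empty if disabled)."""
    x = tuple(x)
    kind = op[0]
    if kind == "add":
        y = tuple(a + b for a, b in zip(x, op[1]))
        # add(v) is enabled only if no counter becomes negative.
        return [y] if all(c >= 0 for c in y) else []
    if kind == "test":
        i = op[1]
        return [x] if x[i] == 0 else []
    if kind == "reset":
        i = op[1]
        return [x[:i] + (0,) + x[i + 1:]]
    if kind == "transfer":
        i, j = op[1], op[2]
        y = list(x)
        y[j] += y[i]   # counter j receives the whole content of counter i
        y[i] = 0
        return [tuple(y)]
    raise ValueError(f"unknown operation: {kind}")
```

A full E-VASS step would pair each valuation with a control state and consult the control structure for the enabled operations.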

We define the following classes of 2-dim E-VASS. For every I ⊆ {1, 2}, J ⊆
{1, 2} and K ⊆ {(1, 2), (2, 1)}, we write T_I R_J Tr_K for the class of 2-dim E-VASS
A such that the set of actions Op of A satisfies:

Op ⊆ {add(v) / v ∈ Z^2}
∪ {test(i), reset(j), transfer(k→k') / i ∈ I, j ∈ J and (k, k') ∈ K}

Notice that T_∅ (resp. R_∅, Tr_∅) means that there is no zero-test transition (resp.
no reset transition, no transfer transition) and we will omit to write T_∅ (resp.
R_∅, Tr_∅). For instance, T_1R_∅Tr_{12,21} (shortly written T_1Tr_{12,21}) is the class of
2-dim E-VASS A where the allowed extended transitions are labelled by test(1),
transfer(1→2) or transfer(2→1).



Simulations 

Let C and D be two classes of E-VASS. We say C is effectively simulable by D
if for every E-VASS A in C with control states Q_A, there exists an E-VASS B in
D with control states Q_B computable from A such that:

1. we have Q_B ⊇ Q̄_A, where Q̄_A is a copy of Q_A, and,

2. for any states (q, x) and (q', x') in A, we have (q, x) →*_A (q', x') iff
(q̄, x) →*_B (q̄', x').

We will use the notion of effective simulation (which is different from the
classical notion of simulation) to have simpler proofs for the effective computation
of post* and pre*. Assume that C is effectively simulable by D. If there
exists an algorithm which computes for every E-VASS in D a finite description
of post* (resp. pre*) then we get that there exists an algorithm which computes
for every E-VASS in C a finite description of post* (resp. pre*).
Theorem 1. We have the following simulations between classes of 2-dim E-VASS:

1 . the class Ti^2 is effectively simulable by the class TiTt21; 

2 . the class TiRi 2 Tti2 is effectively simulable by the class Ti^2, 

3 . the class TiRi 2 Tti2 is effectively simulable by the class T1R2, 

4 -. the class TiRiTti2 is effectively simulable by the class Ti. 

5 . the class Ri,2Tri2,2i is effectively simulable by the class T1R2, 

3 Semilinear Sets and Projections 

Let P = {p_1, p_2, ..., p_k} be a finite subset of N^n. We write P* for the set P* =
{Σ_{i=1}^{k} a_i p_i / ∀i ∈ [1, k], a_i ∈ N}. For every x ∈ N^n, (x + P*) is called a linear
set. A semilinear set is a finite union of linear sets. A description of a semilinear
set L is any finite family {(x_i, P_i) / i ∈ I} such that L = ⋃_{i∈I} (x_i + P_i*). We
will say that a semilinear set L is effectively computable if one can give explicitly
an algorithm which computes a description of L.
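Membership of a tuple in a semilinear set given by a description can be decided by a bounded search, since every non-zero period strictly decreases the remaining difference. The sketch below is a naive illustration (not an efficient procedure); the names `in_linear_set` and `in_semilinear_set` are hypothetical.

```python
def in_linear_set(y, x, P):
    """Decide whether y belongs to x + P*, for x and the periods in P over N^n."""
    target = tuple(a - b for a, b in zip(y, x))
    if any(c < 0 for c in target):
        return False
    periods = [p for p in P if any(p)]   # the zero vector contributes nothing
    seen = set()
    stack = [target]
    while stack:
        r = stack.pop()
        if not any(r):
            return True                  # the difference is a sum of periods
        for p in periods:
            nxt = tuple(a - b for a, b in zip(r, p))
            if all(c >= 0 for c in nxt) and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

def in_semilinear_set(y, description):
    """description is a finite family of pairs (x_i, P_i)."""
    return any(in_linear_set(y, x, P) for x, P in description)
```

With x = (1, 0) and P = {(2, 1), (0, 3)}, the tuple (5, 5) is a member since (4, 5) = 2·(2, 1) + (0, 3), whereas (5, 4) is not.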

Let us remark that for every x ∈ N^n and u ∈ Z^n, the set L = ((x + P*) + u) ∩ N^n
is semilinear as it can be written as L = (B + P*) where B = Min(((x + P*) +
u) ∩ N^n) is finite. Moreover, it can be shown that B is computable [HP79].

For every i ∈ [1, n], we write H_i for the linear set H_i = {x ∈ N^n / x(i) = 0}.
We define two kinds of projection on H_i, extended on subsets of N^n in the usual
way:

— for every i ∈ [1, n], proj_i : N^n → N^n is the orthogonal projection on H_i
defined by proj_i(x) = x' where x'(i) = 0, and ∀k ≠ i, x'(k) = x(k).

— for every i, j ∈ [1, n] such that i ≠ j, proj_ij : N^n → N^n is the diagonal
projection on H_i defined by proj_ij(x) = x' where x'(i) = 0, x'(j) = x(i) + x(j)
and ∀k ∉ {i, j}, x'(k) = x(k).

4 Computation of a Symbolic Reachability Tree 

Let us consider an n-dim E-VASS A. The algorithm we propose later is based on
the construction of a tree labelled by 3-tuples (q, x, P) where q is a control state
of A, x is in N^n and P is a finite subset of N^n. Intuitively, the label l = (q, x, P)
of a node t (written t : l) represents the set of states ⟦l⟧ = {q} × (x + P*) of A.

Notation. For every control state q of an n-dim E-VASS A, for every x ∈ N^n and
for every finite subset P of N^n, we write ⟦q, x, P⟧ for the set of states ⟦q, x, P⟧ =
{q} × (x + P*) of A.

We are now able to define a notion of Symbolic Reachability Tree as follows. 



Definition 2. A Symbolic Reachability Tree for an n-dim E-VASS A with a set
S_0 of initial states is any rooted directed tree T labelled with 3-tuples (q, x, P)
where q is a control state of A, x ∈ N^n and P is a finite subset of N^n, such that:

1. we have post*(S_0) = ⋃ ⟦l⟧, the union ranging over all nodes t : l of T, and,

2. for any node t : l in T such that t has a son, we have post(⟦l⟧) ⊆ ⋃_{i∈I} ⟦l_i⟧ ⊆
post*(⟦l⟧), where {t_i : l_i}_{i∈I} is the set of sons of t.

The SymbolicTree algorithm essentially computes a reachability tree. But as
the reachability tree of an E-VASS may be infinite in general, an Acceleration
function is used to accelerate the computation. Similarly to the Karp-Miller
strategy [KM69], the Acceleration function basically computes the iteration
of a repetitive sequence. This Acceleration function may depend on the input
E-VASS; for instance, we will define specific Acceleration functions suited to
specific classes of E-VASS.



Algorithm 1 SymbolicTree($A$, $S_0$, Acceleration)

Input: an n-dim E-VASS $A$, a linear set $S_0 = \{q_0\} \times (x_0 + P_0^*)$ of initial states of $A$
and an Acceleration function
1: create root labelled $(q_0, x_0, P_0)$
2: while there are unmarked nodes do
3:   pick an unmarked node $t : l$, where $l = (q, x, P)$
4:   mark $t$
5:   if there is an ancestor $t' : l'$ of $t$ such that $\llbracket l \rrbracket \subseteq \llbracket l' \rrbracket$ then
6:     skip
7:   else
8:     $W(t) \leftarrow$ Acceleration($t$, $A$)
9:     {We now compute the set $post(\llbracket q, x, W \rrbracket)$ of immediate successors of $\llbracket q, x, W \rrbracket$}
10:    for each transition $q \xrightarrow{add(v)} q'$ in $A$ do
11:      for each $m \in Min(((x + W^*) + v) \cap \mathbb{N}^n)$ do
12:        construct a son $t' : (q', m, W)$ of $t$ and label the arc $t \xrightarrow{add(v)} t'$
13:    for each transition $q \xrightarrow{test(i)} q'$ in $A$ do
14:      if $x_i = 0$ then
15:        construct a son $t' : (q', x, W \cap H_i)$ of $t$ and label the arc $t \xrightarrow{test(i)} t'$
16:    for each transition $q \xrightarrow{weaktest(i)} q'$ in $A$ do
17:      if $x_i = 0$ then
18:        construct a son $t' : (q', x, (W \cap H_i) \cup \{e_i\})$ of $t$ and label the arc $t \xrightarrow{weaktest(i)} t'$
19:    for each transition $q \xrightarrow{reset(i)} q'$ in $A$ do
20:      construct a son $t' : (q', proj_i(x), proj_i(W) \setminus \{0\})$ of $t$ and label the arc $t \xrightarrow{reset(i)} t'$
21:    for each transition $q \xrightarrow{transfer(i \to j)} q'$ in $A$ do
22:      construct a son $t' : (q', proj_{ij}(x), proj_{ij}(W))$ of $t$ and label the arc $t \xrightarrow{transfer(i \to j)} t'$
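The treatment of test, reset and transfer transitions in Algorithm 1 only manipulates the pair (base vector, set of periods) of a label. The sketch below illustrates these three cases; it is a simplified illustration, not the paper's implementation. The tuple encoding of operations and the name `extended_successor` are assumptions made here, and the add and weaktest cases are deliberately left out.

```python
def proj(x, i):
    """proj_i: set component i (1-based) to 0."""
    return tuple(0 if k == i else v for k, v in enumerate(x, start=1))


def proj_transfer(x, i, j):
    """proj_ij: add component i to component j, then set component i to 0."""
    out = list(x)
    out[j - 1] = x[i - 1] + x[j - 1]
    out[i - 1] = 0
    return tuple(out)


def extended_successor(x, W, op):
    """Return the (base, periods) pair of the son, or None if a test is not enabled.

    op is ("test", i), ("reset", i) or ("transfer", i, j) with 1-based indices.
    """
    kind = op[0]
    if kind == "test":
        i = op[1]
        if x[i - 1] != 0:
            return None
        # W intersected with H_i: keep the periods whose i-th component is 0.
        return x, {w for w in W if w[i - 1] == 0}
    if kind == "reset":
        i = op[1]
        zero = tuple(0 for _ in x)
        return proj(x, i), {proj(w, i) for w in W} - {zero}
    if kind == "transfer":
        _, i, j = op
        return proj_transfer(x, i, j), {proj_transfer(w, i, j) for w in W}
    raise ValueError("unsupported operation: %r" % (kind,))
```

For example, resetting the second counter of the label with base $(3, 4)$ and periods $\{(1, 0), (0, 1)\}$ yields the base $(3, 0)$ and the single period $(1, 0)$, since the period $(0, 1)$ is projected to the zero vector and removed.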



We will often use the following stability property for the different
Acceleration functions considered.

Definition 3. An Acceleration function is n-stable if for every n-dim E-VASS
$A$ with a linear set of initial states $S_0$ and for every node $t : (q, x, P)$ in the
tree constructed by SymbolicTree($A$, $S_0$, Acceleration), we have the following
property:

$$\llbracket q, x, P \rrbracket \subseteq \llbracket q, x, W(t) \rrbracket \subseteq post^*(\llbracket q, x, P \rrbracket)$$
where $W(t)$ = Acceleration($t$, $A$).



Theorem 2. Let $A$ be an n-dim E-VASS and $S_0 = \{q_0\} \times (x_0 + P_0^*)$ be a linear
set of initial states of $A$. If an Acceleration function is n-stable then the
SymbolicTree algorithm, applied to ($A$, $S_0$, Acceleration), constructs a Symbolic
Reachability Tree for $A$ with initial states $S_0$.

Notice that if the SymbolicTree algorithm terminates on some n-dim E-VASS
$A$ with a linear set of initial states $S_0$ and an n-stable Acceleration
function, then $post^*(S_0)$ is semilinear and effectively computable, since it is the
finite union of the linear sets labelling the nodes of the tree. Because 3-dim
VASS can have non-semilinear reachability sets [HP79], we now restrict our
study to 2-dim E-VASS. For simplicity, we will simply write stable for 2-stable.
In the rest of the paper, our aim is to find, for each analyzed class of 2-dim
E-VASS, a dedicated stable Acceleration function ensuring the termination of
the SymbolicTree algorithm for this class.



5 Effective Semilinearity of post* for the Class $T_1R_1Tr_{12}$

We may now generalize Hopcroft-Pansiot's result by showing that $post^*$ is still
semilinear and effectively computable for any E-VASS $A$ in $T_1R_1Tr_{12}$. We extract
from Hopcroft-Pansiot's algorithm an $Acceleration_{HP}$ function which is
stable and such that for every 2-dim VASS $A$ with a linear set of initial states $S_0$,
the SymbolicTree algorithm applied to ($A$, $S_0$, $Acceleration_{HP}$) terminates.
This $Acceleration_{HP}$ function is then used to define an $Acceleration_{T_1}$ function
which allows us to show that $post^*$ is still semilinear and effectively computable
for any E-VASS $A$ in $T_1R_1Tr_{12}$.

Proposition 1. The function $Acceleration_{T_1}$ is stable. Moreover, for every
2-dim E-VASS $A$ in the class $T_1$ with a linear set of initial states $S_0$, the
SymbolicTree algorithm applied to ($A$, $S_0$, $Acceleration_{T_1}$) terminates.

By using the effective simulation of $T_1R_1Tr_{12}$ by $T_1$ (Theorem 1), and
Proposition 1, we obtain the following theorem.

Theorem 3. For any 2-dim E-VASS $A$ in the class $T_1R_1Tr_{12}$ with a semilinear
set $S_0$ of initial states, $post^*(S_0)$ is semilinear and it is effectively computable.

Footnote: The definition of the $Acceleration_{HP}$ function is omitted here, but it can be found in the full version.

Algorithm 2 $Acceleration_{T_1}$($t : (q, x, P)$, $A$)

Input: a node $t : (q, x, P)$ and a 2-dim E-VASS $A$
1: if the branch from the root $r$ to $t$ may be written as $r \to \cdots \xrightarrow{add(v)} t$ for some $v$ then
2: return $Acceleration_{HP}(t, A)$
3: else if the branch from the root $r$ to $t$ may be written as $r \xrightarrow{*} u \xrightarrow{test(1)} t$ then
4: if there is an ancestor $t' : (q, x', P')$ of $t$ such that $(x - x') \in \{0\} \times \mathbb{N}^*$ and there is no extended arc except test(1) arcs between $t'$ and $t$ then
5: return $P \cup \{x - x'\}$
6: else
7: return $P$



6 Effective Semilinearity of pre* for the Class $T_1R_{1,2}Tr_{12}$

We present in this section an algorithm for computing $pre^*$ for the largest class
(not equivalent to $T_{1,2}$) of 2-dim E-VASS. We first prove that $post^*$ of a 2-dim
E-VASS $A$ in $T_1W_2$ is semilinear and effectively computable (in accordance with the
previous notations, we write $T_1W_2$ for the class of 2-dim E-VASS where the allowed
extended transitions are labelled by test(1) or weaktest(2)).



Algorithm 3 $Acceleration_{T_1W_2}$($t : (q, x, P)$, $A$)

Input: a node $t : (q, x, P)$ and a 2-dim E-VASS $A$
1: if the branch from the root $r$ to $t$ may be written as $r \xrightarrow{*} u \xrightarrow{weaktest(2)} t$ then
2: if there is an ancestor $t' : (q, x', P')$ of $t$ such that $(x - x') \in \mathbb{N}^* \times \{0\}$ and there is no extended arc except weaktest(2) arcs between $t'$ and $t$ then
3: return $P \cup \{x - x'\}$
4: else
5: return $Acceleration_{T_1}(t, A)$



Proposition 2. The function $Acceleration_{T_1W_2}$ is stable. Moreover, for every
2-dim E-VASS $A$ in the class $T_1W_2$ with a linear set of initial states $S_0$, the
SymbolicTree algorithm applied to ($A$, $S_0$, $Acceleration_{T_1W_2}$) terminates.

Theorem 4. For any 2-dim E-VASS $A$ in the class $T_1R_{1,2}Tr_{12}$ with a semilinear
set $S_0$ of initial states, $pre^*(S_0)$ is semilinear and it is effectively computable.

Corollary 1. The reachability problem is decidable for the class $T_1R_{1,2}Tr_{12}$.

7 Semilinearity of post* for the Class $T_1R_{1,2}Tr_{12}$

Let us recall that in Section 5 we proved that $post^*$ is semilinear and effectively
computable for any $A$ in $T_1R_1Tr_{12}$. We now extend this semilinearity result to
the class $T_1R_{1,2}Tr_{12}$.

Let $\Omega(A, S_0)$ be the set defined by $\Omega(A, S_0) = \bigcup_{s \in S_0} \Omega(A, s)$ where, for every
state $s$ of $A$, $\Omega(A, s)$ is given by:

$$\Omega(A, s) = \{ q \in Q \;/\; |\{ x \in \mathbb{N} \;/\; \exists \sigma,\ s \xrightarrow{\sigma} (q, x) \}| = \infty \}$$

where $A$ is a 2-dim E-VASS with a set $Q$ of control states and $S_0$ is a set of
initial states of $A$.

The main obstacle to termination is that a branch of the constructed tree
may contain infinitely many interleavings of reset(2) and test(1)
transitions. Hence, the $Acceleration_{T_1R_2}$ function uses the set $\Omega(A, S_0)$ in order
to iterate some special loops in this case. Intuitively, if we have just fired
a reset(2) transition, then we do not care about the actual value of the second
component before the reset(2). Hence if we know that this value may be arbitrarily
large, then we can iterate a simple control loop that increases the first
component before the firing of the reset(2) transition. Formally, a simple control
loop $l$ in an E-VASS $A$ is a path $l = q \xrightarrow{add(v_1)} q_1 \xrightarrow{add(v_2)} q_2 \cdots q_{n-1} \xrightarrow{add(v_n)} q$
in the control graph of $A$ with no repeating state except that the first state
and the last state are the same; moreover we say that $l$ is a control loop on $q$
with displacement $d(l) = v_1 + v_2 + \cdots + v_n$.
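As a small illustration of this definition, the sketch below checks that a sequence of add-transitions forms a simple control loop and computes its displacement. The representation of a path as a list of (source, vector, target) triples and both function names are assumptions made for this example only.

```python
from typing import List, Tuple

# (source control state, added vector, target control state)
Arc = Tuple[str, Tuple[int, int], str]


def is_simple_control_loop(path: List[Arc]) -> bool:
    """True if the arcs chain, return to the first state, and repeat no other state."""
    if not path:
        return False
    for (_, _, tgt), (src, _, _) in zip(path, path[1:]):
        if tgt != src:
            return False  # consecutive arcs do not chain
    if path[-1][2] != path[0][0]:
        return False  # the path does not come back to its first state
    sources = [src for src, _, _ in path]
    return len(set(sources)) == len(sources)


def displacement(path: List[Arc]) -> Tuple[int, int]:
    """Sum v_1 + ... + v_n of the vectors added along the path."""
    return (sum(v[0] for _, v, _ in path), sum(v[1] for _, v, _ in path))
```

A loop on $q$ that adds $(1, -2)$ and then $(2, 1)$ has displacement $(3, -1)$: its first component is positive, while its second component may be negative.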

Notice that the set $\Omega(A, S_0)$ is finite, as it is a subset of $Q$. Hence, even if we
do not know whether we can effectively compute this set, we can use
it as an “oracle” in our $Acceleration_{T_1R_2}$ function.



Algorithm 4 $Acceleration_{T_1R_2}$($t : (q, x, P)$, $A$)

Input: a node $t : (q, x, P)$ and a 2-dim E-VASS $A$
1: if the branch from the root $r$ to $t$ may be written as $r \xrightarrow{*} u \xrightarrow{reset(2)} t$ then
2: if there is an ancestor $t' : (q, x', P')$ of $t$ such that $(x - x') \in \mathbb{N}^* \times \{0\}$ and there is no extended arc except reset(2) arcs between $t'$ and $t$ then
3: return $P \cup \{x - x'\}$
4: else if the branch from the root $r$ to $t$ may be written as $r \to s \to t' \to t'' \xrightarrow{reset(2)} t$ with no extended arc between $t'$ and $t''$ then
5: if the label $(q', x', P')$ of $t'$ is such that $q' \in \Omega(A, S_0)$ then
6: if there exists a node $u : (r, y, Q)$ between $t'$ and $t''$ and a simple control loop on $r$ with displacement $(a, b) \in \mathbb{N}^* \times \mathbb{Z}$ then
7: return $P \cup \{(a, 0)\}$
8: else
9: return $Acceleration_{T_1}(t, A)$



Proposition 3. The function $Acceleration_{T_1R_2}$ is stable. Moreover, for every
2-dim E-VASS $A$ in the class $T_1R_2$ with a linear set of initial states $S_0$, the
SymbolicTree algorithm applied to ($A$, $S_0$, $Acceleration_{T_1R_2}$) terminates.

From Theorem 1 the class $T_1R_{1,2}Tr_{12}$ is effectively simulated by $T_1R_2$.
From Proposition 3 we conclude that $post^*$ is semilinear.

Theorem 5. For any 2-dim E-VASS $A$ in the class $T_1R_{1,2}Tr_{12}$ with a semilinear
set $S_0$ of initial states, $post^*(S_0)$ is semilinear. Moreover, if $\Omega(A, S_0)$ is
effectively computable, then $post^*(S_0)$ is effectively computable.

It remains an open problem whether it is possible to compute a finite
description of the semilinear reachability set for the class $T_1R_{1,2}Tr_{12}$.

8 Conclusion 

We have systematically studied all the subclasses of two {+1, -1, reset, transfer,
zero-test}-counter automata and given a complete hierarchy between classes
using new simulations. It turns out that a “maximal decidable model” consists
of the class of 2-counter automata ($T_1R_{1,2}Tr_{12}$) with the following extended
operations: test for zero on the first counter, reset on the two counters, and
transfer from the first to the second counter. For this model, the $pre^*$ image is
effectively semilinear.




Fig. 1. Graph of inclusions and simulations 



Our results cannot be generalized to three {+1, -1}-counter automata because
they may have non-semilinear reachability sets; moreover, reachability is
undecidable for three {+1, -1, reset}-counter automata such that one of the
counters never uses the reset operation.

Our symbolic algorithm has been defined for any {+1, -1, reset, transfer,
zero-test}-counter automata. It can also be used as a semi-algorithm when
termination cannot be guaranteed.

Many other open problems remain: for example, we do not know
whether $post^*$ is effectively semilinear for the class $T_1R_{1,2}Tr_{12}$. The complexity
of the symbolic algorithm is certainly at least in 2^* * jITT)I -l-¥8fi| but we don’t 
know if it is primitive recursive or not. Another point which has to be stated is 
the comparison between classes of 2-counters automata and stack automata. 
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Abstract History preserving bisimilarity (hp-bisimilarity) and hered- 
itary history preserving bisimilarity (hhp-bisimilarity) are behavioural 
equivalences taking into account causal relationships between events of 
concurrent systems. Their prominent feature is being preserved under ac- 
tion refinement, an operation important for the top-down design of con- 
current systems. We show that — unlike hp-bisimilarity — checking hhp- 
bisimilarity for finite labelled asynchronous transition systems is not de- 
cidable, by a reduction from the halting problem of 2-counter machines. 
To make the proof more transparent we introduce an intermediate prob- 
lem of checking domino bisimilarity for origin constrained tiling systems, 
whose undecidability is interesting in its own right. We also argue that 
the undecidability of hhp-bisimilarity holds for finite labelled 1-safe Petri 
nets. 



1 Introduction 

The notion of behavioural equivalence that has attracted most attention in
concurrency theory is bisimilarity, originally introduced by Park and Milner:
concurrent programs are considered to have the same meaning if they are
bisimilar. The prominent role of bisimilarity is due to the many pleasant properties it
enjoys; we mention a few of them here.

The process of checking whether two transition systems are bisimilar can be
seen as a two-player game, which is in fact an Ehrenfeucht-Fraïssé type of game
for modal logic. More precisely, there is a winning strategy for the player who
wants to show that the systems are bisimilar if and only if the systems cannot
be distinguished by the formulas of the logic; this result is due to Hennessy and
Milner.

Another notable property of bisimilarity is its computational feasibility, as
surveyed in the literature. Let us illustrate this on the examples of finite
transition systems and a class of infinite-state transition systems generated
by context free grammars. For finite transition systems there are very efficient
polynomial time algorithms for checking bisimilarity, in sharp contrast
to the PSPACE-completeness of the classical language equivalence. For transition
systems generated by context free grammars, while language equivalence is
undecidable, bisimilarity is decidable, and if the grammar has no redundant
nonterminals, even in polynomial time. Furthermore, as known results
indicate, bisimilarity has the very rare status of being a decidable equivalence for
context free grammars: all the other equivalences in the linear/branching time
hierarchy are indeed undecidable. This algorithmic tractability makes bisimilarity
especially attractive for automatic verification of concurrent systems.

The essence of bisimilarity “is that the behaviour of a program is
determined by how it communicates with an observer.” Therefore, the notion of
what can be observed of the behaviour of a system affects the notion of bisimilarity.
An abstract definition of bisimilarity for arbitrary categories of models due
to Joyal et al. [12] formalizes this idea. Given a category of models where objects
are behaviours and morphisms correspond to extension of behaviours, and given
a subcategory of observable behaviours, the abstract definition yields a notion of
bisimilarity for all behaviours with respect to observable behaviours. For example,
for rooted labelled transition systems, taking the synchronization trees into
which they unfold as their behaviours, and sequences of actions as the observable
behaviours, we recover the standard strong bisimilarity of Park and Milner.

In order to model concurrency more faithfully, several models have been
introduced that make explicit the distinction between events
that can occur concurrently and those that are causally related. Then a natural
choice is to replace sequences, i.e., linear orders as the observable behaviours,
by partial orders of occurrences of events with causality as the ordering relation.
For example, taking unfoldings of labelled asynchronous transition systems
into event structures as the behaviours, and labelled partial orders as the
observations, Joyal et al. [12] obtained from their abstract definition the hereditary
history preserving bisimilarity (hhp-bisimilarity), independently introduced and
studied by Bednarczyk.

A similar notion of bisimilarity has been studied before, namely history
preserving bisimilarity (hp-bisimilarity), introduced by Rabinovich and
Trakhtenbrot and by van Glabbeek and Goltz. The relationship between hp- and
hhp-bisimilarity is discussed in the literature.

One of the important motivations to study partial order based equivalences
was the discovery that hp-bisimilarity has the rare status of being preserved
under action refinement, an operation important for the top-down design of
concurrent systems. Bednarczyk has extended this result to hhp-bisimilarity.

There is a natural logical characterization of hhp-bisimilarity checking games
as shown by Nielsen and Clausen: they are characteristic games for an extension
of modal logic with backwards modalities, interpreted over event structures.

Hp-bisimilarity has been shown to be decidable for 1-safe Petri nets by
Vogler, and to be DEXP-complete by Jategaonkar and Meyer; let
us just mention here that 1-safe Petri nets can be regarded as a proper
subclass of finite asynchronous transition systems (see [22] for details), and that
decidability of hp-bisimilarity can be easily extended to all finite asynchronous
transition systems using existing methods.

Hhp-bisimilarity seems to be only a slight strengthening of hp-bisimilarity
and hence many attempts have been made to extend the above mentioned
algorithms to the case of hhp-bisimilarity. However, decidability of hhp-bisimilarity
has remained open, despite several attempts over the years. Fröschle
and Hildebrandt have discovered an infinite hierarchy of bisimilarity notions
refining hp-bisimilarity, and coarser than hhp-bisimilarity, such that
hhp-bisimilarity is the intersection of all the bisimilarities in the hierarchy. They have
shown all these bisimilarities to be decidable for 1-safe Petri nets. Fröschle
has shown hhp-bisimilarity to be decidable for BPP-processes, a class of infinite
state systems.

In this paper, we finally settle the question of decidability of hhp-bisimilarity 
by showing it to be undecidable for finite 1-safe Petri nets. In order to make the 
proof more transparent we first introduce an intermediate problem of domino 
bisimilarity and show its undecidability by a direct reduction from the halting 
problem of 2-counter machines. 



2 Hereditary History Preserving Bisimilarity 



Definition 1 (Labelled asynchronous transition system)

A labelled asynchronous transition system is a tuple $A = (S, s^{ini}, E, \to, L, \lambda, I)$,
where $S$ is its set of states, $s^{ini} \in S$ is the initial state, $E$ is the set of events,
$\to \subseteq S \times E \times S$ is the set of transitions, $L$ is the set of labels, $\lambda : E \to L$ is the
labelling function, and $I \subseteq E^2$ is the independence relation, which is irreflexive
and symmetric. We often write $s \xrightarrow{e} s'$ instead of $(s, e, s') \in \to$. Moreover, the
following conditions have to be satisfied:

1. if $s \xrightarrow{e} s'$, and $s \xrightarrow{e} s''$, then $s' = s''$,

2. if $(e, e') \in I$, $s \xrightarrow{e} s'$, and $s' \xrightarrow{e'} t$, then $s \xrightarrow{e'} s''$, and $s'' \xrightarrow{e} t$ for some $s'' \in S$.

An asynchronous transition system is coherent if it satisfies one further condition:

3. if $(e, e') \in I$, $s \xrightarrow{e} s'$, and $s \xrightarrow{e'} s''$, then $s' \xrightarrow{e'} t$, and $s'' \xrightarrow{e} t$ for some $t \in S$.

[Definition 1] □



Winskel and Nielsen give a thorough survey and establish formal relationships
between asynchronous transition systems and other models for concurrency,
such as Petri nets and event structures. The independence relation is
meant to model concurrency: independent events can occur concurrently, while
those that are not independent are causally related or in conflict.

Let $A = (S, s^{ini}, E, \to, L, \lambda, I)$ be a labelled asynchronous transition system.
A sequence of events $\overline{e} = \langle e_1, e_2, \ldots, e_n \rangle \in E^n$ is a run of $A$ if there are states
$s_1, s_2, \ldots, s_{n+1} \in S$, such that $s_1 = s^{ini}$, and for all $i \in [n]$ we have $s_i \xrightarrow{e_i} s_{i+1}$.
We denote the set of runs of $A$ by Runs($A$). We extend the labelling function $\lambda$
to runs in the standard way. We say that $k \in [n]$ is most recent in $\overline{e}$, and we
denote it by $k \in MR(\overline{e})$, if and only if $(e_k, e_\ell) \in I$ for all $\ell$ such that $k < \ell \le n$.
Note that if $k \in MR(\overline{e})$ then $\overline{e} \ominus k = \langle e_1, \ldots, e_{k-1}, e_{k+1}, \ldots, e_n \rangle \in$ Runs($A$).
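The notion of most recent positions is easy to compute directly from the definition. The following sketch assumes that a run is a tuple of event names and that the independence relation is supplied as a Boolean function; the names `most_recent` and `remove_position` are hypothetical and chosen for this illustration.

```python
def most_recent(run, independent):
    """Return the set of 1-based positions k such that e_k is independent
    of every event occurring later in the run."""
    n = len(run)
    return {
        k
        for k in range(1, n + 1)
        if all(independent(run[k - 1], run[l - 1]) for l in range(k + 1, n + 1))
    }


def remove_position(run, k):
    """Return the run with its k-th event (1-based) removed."""
    return run[: k - 1] + run[k:]
```

With events `a` and `b` independent of each other and `c` independent of nothing, both positions of the run `("a", "b")` are most recent, while only the last position of `("a", "c", "b")` is.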

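Definition 2 (Hereditary history preserving bisimulation)

Let $A_i = (S_i, s_i^{ini}, E_i, \to_i, L, \lambda_i, I_i)$ for $i \in \{1, 2\}$ be labelled asynchronous
transition systems. A relation $B \subseteq$ Runs($A_1$) $\times$ Runs($A_2$) is a hereditary history
preserving (hhp-) bisimulation relating $A_1$ and $A_2$ if the following conditions are
satisfied:

1. $(\varepsilon, \varepsilon) \in B$,

and if $(\overline{e_1}, \overline{e_2}) \in B$ then $\lambda_1(\overline{e_1}) = \lambda_2(\overline{e_2})$, and:

2. for all $e_1 \in E_1$, if $\overline{e_1} \cdot e_1 \in$ Runs($A_1$), then there exists $e_2 \in E_2$, such that
$\overline{e_2} \cdot e_2 \in$ Runs($A_2$), and $\lambda_1(e_1) = \lambda_2(e_2)$, and $(\overline{e_1} \cdot e_1, \overline{e_2} \cdot e_2) \in B$,

3. for all $e_2 \in E_2$, if $\overline{e_2} \cdot e_2 \in$ Runs($A_2$), then there exists $e_1 \in E_1$, such that
$\overline{e_1} \cdot e_1 \in$ Runs($A_1$), and $\lambda_1(e_1) = \lambda_2(e_2)$, and $(\overline{e_1} \cdot e_1, \overline{e_2} \cdot e_2) \in B$,

4. $k \in MR(\overline{e_1})$, if and only if $k \in MR(\overline{e_2})$,

5. if $k \in MR(\overline{e_1}) = MR(\overline{e_2})$, then $(\overline{e_1} \ominus k, \overline{e_2} \ominus k) \in B$. [Definition 2] □

Two asynchronous transition systems $A_1$ and $A_2$ are hereditary history preserving
(hhp-) bisimilar if there is an hhp-bisimulation relating them.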

Remark 1 The term hereditary history preserving bisimulation originates from
the fact that this notion of bisimulation has an alternative definition, which is
formally a small strengthening of the standard definition of history preserving
bisimulation, based explicitly on partial order behaviours. Note
that Definition 2 does not mention partial order behaviours explicitly, but they
are implicit in the notion of most recent occurrences of events. A proof of the
equivalence of our definition and the other ones can be found in the literature.

The main result of this paper is the following theorem, proved in Section 4.

Theorem 3 (Undecidability of hhp-bisimilarity) 

Hhp-bisimilarity is undecidable for finite labelled asynchronous transition sys- 
tems. 

3 Domino Bisimilarity Is Undecidable 

3.1 Domino Bisimilarity 

Definition 4 (Origin constrained tiling system)

An origin constrained tiling system $T = (D, D^{ori}, (H, H^0), (V, V^0), L, \lambda)$ consists
of a set $D$ of dominoes, its subset $D^{ori} \subseteq D$ called the origin constraint, two
horizontal compatibility relations $H, H^0 \subseteq D^2$, two vertical compatibility relations
$V, V^0 \subseteq D^2$, a set $L$ of labels, and a labelling function $\lambda : D \to L$.

[Definition 4] □

A configuration of $T$ is a triple $(d, x, y) \in D \times \mathbb{N} \times \mathbb{N}$, such that if $x = y = 0$
then $d \in D^{ori}$. In other words, in the “origin” position $(x, y) = (0, 0)$ of the
non-negative integer grid only dominoes from the origin constraint are allowed.

Let $(d, x, y)$, and $(d', x', y')$ be configurations of $T$ such that $|x' - x| + |y' - y| = 1$,
i.e., the positions $(x, y)$, and $(x', y')$ are neighbours in the non-negative integer
grid. Without loss of generality we may assume that $x + y < x' + y'$. We say
that configurations $(d, x, y)$, and $(d', x', y')$ are compatible if either of the two
conditions below holds:

- $x' = x$, and $y' = y + 1$, and
  if $y = 0$, then $(d, d') \in V^0$, and if $y > 0$, then $(d, d') \in V$, or

- $x' = x + 1$, and $y' = y$, and
  if $x = 0$, then $(d, d') \in H^0$, and if $x > 0$, then $(d, d') \in H$.
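The compatibility test can be written down directly. The sketch below is an illustration only: the `TilingSystem` container and the function name `compatible` are assumptions of this example, labels are omitted because they play no role in compatibility, and the arguments are assumed to be configurations already.

```python
from dataclasses import dataclass


@dataclass
class TilingSystem:
    dominoes: set
    origin: set  # origin constraint, a subset of dominoes
    H: set       # horizontal compatibility, used when x > 0
    H0: set      # horizontal compatibility, used when x = 0
    V: set       # vertical compatibility, used when y > 0
    V0: set      # vertical compatibility, used when y = 0


def compatible(T, c1, c2):
    """Decide whether two configurations (d, x, y) of T are compatible."""
    (d, x, y), (e, x2, y2) = c1, c2
    if abs(x2 - x) + abs(y2 - y) != 1:
        return False  # the two positions are not neighbours in the grid
    if x + y > x2 + y2:
        # Reorder so that (d, x, y) is the configuration closer to the origin.
        (d, x, y), (e, x2, y2) = (e, x2, y2), (d, x, y)
    if x2 == x:
        relation = T.V0 if y == 0 else T.V  # vertical step
    else:
        relation = T.H0 if x == 0 else T.H  # horizontal step
    return (d, e) in relation
```

The order of the two arguments does not matter, which mirrors the "without loss of generality" assumption in the text.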



Definition 5 (Domino bisimulation)

Let $T_i = (D_i, D_i^{ori}, (H_i, H_i^0), (V_i, V_i^0), L_i, \lambda_i)$ for $i \in \{1, 2\}$ be origin constrained
tiling systems. A relation $B \subseteq D_1 \times D_2 \times \mathbb{N} \times \mathbb{N}$ is a domino bisimulation relating
$T_1$ and $T_2$, if $(d_1, d_2, x, y) \in B$ implies that $\lambda_1(d_1) = \lambda_2(d_2)$, and the following
conditions are satisfied for all $i \in \{1, 2\}$:

1. for all $d_i \in D_i^{ori}$, there is $d_{3-i} \in D_{3-i}^{ori}$, so that $\lambda_1(d_1) = \lambda_2(d_2)$, and
$(d_1, d_2, 0, 0) \in B$,

2. for all $x, y \in \mathbb{N}$, such that $(x, y) \neq (0, 0)$, and $d_i \in D_i$, there is $d_{3-i} \in D_{3-i}$,
such that $\lambda_1(d_1) = \lambda_2(d_2)$, and $(d_1, d_2, x, y) \in B$,

3. if $(d_1, d_2, x, y) \in B$, then for all neighbours $(x', y') \in \mathbb{N} \times \mathbb{N}$ of $(x, y)$, and
$d_i' \in D_i$, if configurations $(d_i, x, y)$, and $(d_i', x', y')$ of $T_i$ are compatible, then
there exists $d_{3-i}' \in D_{3-i}$, such that $\lambda_1(d_1') = \lambda_2(d_2')$, and configurations
$(d_{3-i}, x, y)$, and $(d_{3-i}', x', y')$ of $T_{3-i}$ are compatible, and $(d_1', d_2', x', y') \in B$.

[Definition 5] □

We say that two tiling systems are domino bisimilar if and only if there is a 
domino bisimulation relating them. 



Theorem 6 (Undecidability of domino bisimilarity) 

Domino bisimilarity is undecidable for origin constrained tiling systems. 

The proof is a reduction from the halting problem for deterministic 2-counter
machines. For a deterministic 2-counter machine $M$ we define in Section 3.3 two
origin constrained tiling systems $T_1$, and $T_2$, enjoying the following property.



Proposition 7 Machine $M$ does not halt, if and only if there is a domino
bisimulation relating $T_1$ and $T_2$.



Hereditary History Preserving Bisimilarity Is Undecidable 363 



3.2 Counter Machines 

A 2-counter machine $M$ consists of a finite program with the set $L$ of instruction
labels, and instructions of the form:

• $\ell$: $c_i := c_i + 1$; goto $m$

• $\ell$: if $c_i = 0$ then $c_i := c_i + 1$; goto $m$
     else $c_i := c_i - 1$; goto $n$

• halt:

where $i = 1, 2$; $\ell, m, n \in L$, and $\{start, halt\} \subseteq L$. A configuration of $M$ is a
triple $(\ell, x, y) \in L \times \mathbb{N} \times \mathbb{N}$, where $\ell$ is the label of the current instruction, and $x$,
and $y$ are the values stored in counters $c_1$, and $c_2$, respectively; we denote the
set of configurations of $M$ by Confs($M$). The semantics of 2-counter machines
is standard: let $\vdash_M \subseteq$ Confs($M$) $\times$ Confs($M$) be the usual one-step derivation
relation on configurations of $M$; by $\vdash_M^+$ we denote the reachability (in at least
one step) relation for configurations, i.e., the transitive closure of $\vdash_M$.
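The one-step derivation relation of such a machine can be sketched as follows. The encoding of a program as a dictionary from labels to tuples `("inc", i, m)`, `("test", i, m, n)` or `("halt",)` is an assumption made for this illustration, as are the names `step` and `run`. Since halting is undecidable in general, `run` only explores a bounded number of steps.

```python
def step(program, config):
    """Return the unique successor of config, or None if config is halting."""
    label, x, y = config
    instr = program[label]
    if instr[0] == "halt":
        return None
    counters = [x, y]
    if instr[0] == "inc":
        _, i, target = instr
        counters[i - 1] += 1
    else:  # "test": increment on zero, decrement otherwise
        _, i, if_zero, if_positive = instr
        if counters[i - 1] == 0:
            counters[i - 1] += 1
            target = if_zero
        else:
            counters[i - 1] -= 1
            target = if_positive
    return (target, counters[0], counters[1])


def run(program, max_steps):
    """Run from (start, 0, 0); return the halting configuration if one is
    reached within max_steps derivation steps, and None otherwise."""
    config = ("start", 0, 0)
    for _ in range(max_steps):
        successor = step(program, config)
        if successor is None:
            return config
        config = successor
    return None
```

Because each configuration has at most one successor, the graph of configurations has out-degree at most one, which is the determinism property used in the discussion that follows.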

Before we give a reduction from the halting problem of 2-counter machines
to origin constrained domino bisimilarity let us take a look at the directed graph
(Confs($M$), $\vdash_M$), with configurations of $M$ as vertices, and edges denoting
derivation in one step. Since machine $M$ is deterministic, for each configuration there
is at most one outgoing edge; moreover only halting configurations have no
outgoing edges. It follows that connected components of the graph (Confs($M$), $\vdash_M$)
are either trees with edges going to the root, which is the unique halting
configuration in the component, or have no halting configuration at all. This observation
implies the following proposition.

Proposition 8 Let $M$ be a 2-counter machine. The following conditions are
equivalent:

1. machine $M$ halts on input (0,0), i.e., $(start, 0, 0) \vdash_M^+ (halt, x, y)$ for some
$x, y \in \mathbb{N}$,

2. $(start, 0, 0) \approx_M (halt, x, y)$ for some $x, y \in \mathbb{N}$, where the relation $\approx_M \subseteq$
Confs($M$) $\times$ Confs($M$) is the symmetric and transitive closure of $\vdash_M$.



3.3 The Reduction 

Now we turn to the proof of Proposition 7. The idea is to design a tiling system
which “simulates” the behaviour of a 2-counter machine.

Let $M$ be a 2-counter machine. We construct a tiling system $T_M$ with the
set $L$ of instruction labels of $M$ as the set of dominoes, and the identity function
on $L$ as the labelling function. Note that this implies that all tuples belonging
to a domino bisimulation relating copies of $T_M$ are of the form $(\ell, \ell, x, y)$, so we
can identify them with configurations of $M$, i.e., sometimes we will make no
distinction between $(\ell, \ell, x, y)$ and $(\ell, x, y) \in$ Confs($M$) for $\ell \in L$.

We define the horizontal compatibility relations $H_M, H_M^0 \subseteq L \times L$ of the
tiling system $T_M$ as follows:

- $(\ell, m) \in H_M$ if and only if either of the instructions below is an instruction
  of machine $M$:

  • $\ell$: $c_1 := c_1 + 1$; goto $m$

  • $m$: if $c_1 = 0$ then $c_1 := c_1 + 1$; goto $n$
       else $c_1 := c_1 - 1$; goto $\ell$

- $(\ell, m) \in H_M^0$ if and only if $(\ell, m) \in H_M$, or the instruction below is an
  instruction of machine $M$:

  • $\ell$: if $c_1 = 0$ then $c_1 := c_1 + 1$; goto $m$
       else $c_1 := c_1 - 1$; goto $n$

Vertical compatibility relations $V_M$, and $V_M^0$ are defined in the same way, with
$c_1$ instructions replaced with $c_2$ instructions. We also take $D_M^{ori} = L$, i.e., all
dominoes are allowed in position (0,0). Note that the identity function is a 1-1
correspondence between configurations of $M$, and configurations of the tiling
system $T_M$; from now on we will hence identify configurations of $M$ and $T_M$.
It follows immediately from the construction of $T_M$, that two configurations
$c, c' \in$ Confs($M$) are compatible as configurations of $T_M$, if and only if $c \vdash_M c'$,
or $c' \vdash_M c$, i.e., the compatibility relation of $T_M$ coincides with the symmetric
closure of $\vdash_M$. By $\approx_{T_M}$ we denote the symmetric and transitive closure of the
compatibility relation of configurations of $T_M$. The following proposition is then
straightforward.

Proposition 9 The two relations $\approx_{T_M}$ and $\approx_M$ coincide.
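The compatibility relations of $T_M$ defined above can be derived mechanically from the program of $M$. The sketch below is an illustration only: it assumes the hypothetical encoding of a program as a dictionary mapping each label to `("inc", i, m)`, `("test", i, m, n)` or `("halt",)`, and the function name `compatibility_relations` is not from the paper. Calling it with `counter=1` gives the horizontal relations, and with `counter=2` the vertical ones.

```python
def compatibility_relations(program, counter):
    """Return (R, R0): the relation used away from the border and the one
    used at coordinate 0, for instructions acting on the given counter."""
    R, R0 = set(), set()
    for label, instr in program.items():
        if instr[0] == "inc" and instr[1] == counter:
            # label: c := c + 1; goto m  relates label (left) to m (right).
            R.add((label, instr[2]))
        elif instr[0] == "test" and instr[1] == counter:
            _, _, if_zero, if_positive = instr
            # The decrement branch moves towards the origin, so the target
            # of the jump sits on the left and the tested label on the right.
            R.add((if_positive, label))
            # The zero branch is only possible when the counter is 0.
            R0.add((label, if_zero))
    R0 |= R
    return R, R0
```

The relation for coordinate 0 always contains the general one, matching the clause "$(\ell, m) \in H_M^0$ if and only if $(\ell, m) \in H_M$, or ...".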

Now we are ready to define the two origin constrained tiling systems $T_1$, and
$T_2$, postulated in Proposition 7. The idea is to have two independent and slightly
pruned copies of $T_M$ in $T_2$: one without the initial configuration (start, 0, 0),
and the other without any halting configurations (halt, $x$, $y$). The other tiling
system $T_1$ is going to have three independent copies of $T_M$: the two of $T_2$, and
moreover, another full copy of $T_M$.

More formally we define $D_2 = (L \times \{1, 2\}) \setminus \{(halt, 2)\}$, and $D_2^{ori} = D_2 \setminus
\{(start, 1)\}$, and $V_2 = ((V_M \otimes 1) \cup (V_M \otimes 2)) \cap (D_2 \times D_2)$, where for a binary
relation $R$ we define $R \otimes i$ to be the relation $\{ ((a, i), (b, i)) : (a, b) \in R \}$. The
other compatibility relations $V_2^0$, $H_2$, and $H_2^0$ are defined analogously from the
respective compatibility relations of $T_M$.

The tiling system $T_1$ is obtained from $T_2$ by adding yet another independent
copy of $T_M$, this time a complete one: $D_1 = D_2 \cup (L \times \{3\})$, and $D_1^{ori} = D_2^{ori} \cup
(L \times \{3\})$, and $V_1 = V_2 \cup (V_M \otimes 3)$, etc. The labelling functions of $T_1$, and $T_2$
map each domino $(\ell, i)$ to $\ell$.

In order to show Proposition 7 and hence conclude the proof of Theorem 6
it suffices to establish the following two claims.

Claim 10 If machine $M$ halts on input (0,0), then there is no domino
bisimulation relating $T_1$ and $T_2$.

Claim 11 If machine $M$ does not halt on input (0,0), then there is a domino
bisimulation relating $T_1$ and $T_2$.






4 Hhp-Bisimilarity Is Undecidable 

The proof of Theorem El is a reduction from the problem of deciding domino 
bisimilarity for origin constrained tiling systems. A method of encoding a tiling 
system on an infinite grid in the unfolding of a finite asynchronous transition 
system is due to Madhusudan and Thiagarajan El; we use a modified version of 
a gadget invented by them. For each origin constrained tiling system T we define 
an asynchronous transition system A{T), such that the following proposition 



Proposition 12 There is a domino bisimulation relating origin constrained 
tiling systems T\ and T 2 , if and only if there is a hhp-bisimulation relating 
the asynchronous transition systems A{Ti) and A{T 2 ). 

Let T = (D, D°^\ {H, H^), (V, L, A) be an origin constrained tiling system. 
We define the asynchronous transition system A{T). The schematic structure of 
A{T) can be seen in Figured The set of events is defined as: 



The rough idea behind the construction of A(T) is best explained in terms 
of its event structure unfolding m, in which the configurations of x- and y- 
transitions simply represent the grid structure of a tiling system, following m 
Configurations in general consist of such a grid point plus at most two “d”- and 
“d” -events, where the vertical (horizontal) compatibility of the tiling system is 
represented by the independence between a d^- and a (d(i_|_i)j-) event. 

Notation: By abuse of notation we sometimes write dxy or d^y for x,y G N; 
we always mean by that the events dxy or dxy, respectively, where for z € N we 
define z to be z if z < 2, and 2 for even z, and 1 for odd z if z > 2. [Notation] O 
The labelling function replaces dominoes in “d”-, and “d”-events, with their 
labels in the tiling system: 



The states, events, and transitions of $A(T)$ can be read from Figure 1; we briefly
explain below how to do it.

There are sixteen states in the bottom layer of the structure in Figure 1(a).
Let us identify these sixteen states with pairs of numbers shown on the vertical
macro-arrows originating in these states shown in Figure 1(a). Each of these
macro-arrows denotes a bundle of $d_{ij}$-, and $\bar{d}_{ij}$-event transitions sticking out of
the state below, arranged in the fashion shown in Figure 1(b). For each state
$(i,j)$, and domino $d \in D$, there are $d_{ij}$-, and $\bar{d}_{ij}$-event transitions sticking out,
and moreover for each state $(i',j')$ from which there is an arrow in Figure 1(a)
to state $(i,j)$, there is a $d_{i'j'}$-event transition sticking out of $(i,j)$. The state
$(0,0)$ is exceptional: only dominoes from the origin constraint $D^{ori}$ are allowed
as events of transitions sticking out of it. It is also the initial state of $A(T)$.

The set of events of $A(T)$ is

$$E_{A(T)} = \{ x_i, y_i : i \in \{0,1,2,3\} \} \cup \{ d_{ij}, \bar{d}_{ij} : i,j \in \{0,1,2\},\ d \in D, \text{ and } d \in D^{ori} \text{ if } (i,j) = (0,0) \}.$$

Fig. 1. The structure of the asynchronous transition system $A(T)$. (b) The fine structure of the upper-right cube of $A(T)$.

As can be seen in Figure 1(b), from both ends of the $\bar{d}_{ij}$-event transition
rooted in state $(i,j)$, there is an $x_i$-event transition to the corresponding (bottom,
or top) $(i \oplus 1, j)$ state, and a $y_j$-event transition to the corresponding $(i, j \oplus 1)$
state, where $i \oplus 1 = i + 1$ if $i < 3$, and $i \oplus 1 = 2$ if $i = 3$.
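The two index conventions used in this construction can be sketched in Python: folding an arbitrary natural number into {0, 1, 2} (identity up to 2, then by parity), and the successor on {0, 1, 2, 3} that maps 3 back to 2. The function names `fold` and `succ` are ours and do not appear in the paper; this is only an illustration of the arithmetic.

```python
def fold(z: int) -> int:
    """Fold a natural number into {0, 1, 2}.

    Values up to 2 are kept; larger even values map to 2, larger odd values to 1.
    (Hypothetical helper name, chosen for this sketch.)
    """
    if z < 0:
        raise ValueError("z must be a natural number")
    if z <= 2:
        return z
    return 2 if z % 2 == 0 else 1


def succ(i: int) -> int:
    """Successor on {0, 1, 2, 3}: i + 1 below 3, and 2 when i = 3."""
    if not 0 <= i <= 3:
        raise ValueError("i must lie in {0, 1, 2, 3}")
    return i + 1 if i < 3 else 2
```

Iterating `succ` from 0 visits 0, 1, 2, 3 and then alternates between 2 and 3, which is how a finite structure can stand for an unbounded grid coordinate.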

For each $d_{i'j'}$-event transition $t$ sticking out of state $(i,j)$ and each $e \in D$,
there can be a pair of transitions which together with $t$ and the $e_{ij}$-event
transition form a “diamond” of transitions; the events of the transitions lying on
the opposite sides of the diamond coincide then. This type of transitions is shown
in Figure 1(b) as dotted arrows. The condition for the two transitions closing the
diamond to exist is that configurations $(d, i', j')$ and $(e, i' + |i' - i|, j' + |j' - j|)$
of $T$ are compatible, or $(i',j') = (i,j)$ and $e = d$. We define the independence
relation $I_{A(T)} \subseteq E_{A(T)} \times E_{A(T)}$ to be the symmetric closure of the set:

[{xi,yj),{xi,dij),{yj,dij) : i, j G {0, . . . , 3}, and d G } U 
{ (dij,dij) : i,jG {0, 1, 2}, and d G D} U 
{ (doj,eij) : j G {0,1,2}, and (d, e) G H° } U 
{ (dy,e(j+i)j) : iG G {0,1,2}, and (d,e) G id } U 

{ (d*o, eii) : i G {0, 1, 2}, and (d, e) GV°] U 

{ (dy ,e,(j+i)) : z G {0, l,2},j G {1,2}, and (d,e)GH}. 

Note that it follows from the above that all diamonds of transitions in $A(T)$ are
in fact independence diamonds.

Proof sketch (of the Proposition): The idea is to show that every domino bisimulation
for $T_1$ and $T_2$ gives rise to an hhp-bisimulation for $A(T_1)$ and $A(T_2)$, and
vice versa. First observe that a run of $A(T_i)$ for $i \in \{1,2\}$ consists of a number
of occurrences of $x_j$- and $y_k$-events, $x$ and $y$ of them respectively, and a set of
“$d$”- and “$\bar{d}$”-events, which is of size at most two. In other words, we can map
runs of $A(T_i)$ into triples $(F_i, x, y)$, where $F_i \subseteq E_{A(T_i)}$ contains at most two
“$d$”- and “$\bar{d}$”-events, and $x, y \in \mathbb{N}$. Define $Confs(A(T_1), A(T_2))$ to be the set of
quadruples $(F_1, F_2, x, y)$ where the $F_i$'s are as above and $x, y \in \mathbb{N}$. Then it is a
matter of routine verification to see that there exists an hhp-bisimulation between
$A(T_1)$ and $A(T_2)$, if and only if there exists a relation $B \subseteq Confs(A(T_1), A(T_2))$,
such that $\{ (\rho_1, \rho_2) : (F_1, F_2, x, y) \in B$, where $\rho_i$ is mapped to $(F_i, x, y) \}$ is an
hhp-bisimulation relating $A(T_1)$ and $A(T_2)$. Hence, in the following we identify
an hhp-bisimulation with such a relation $B$. The following claim immediately
implies the Proposition.

Claim 13
1. Let $B \subseteq Confs(A(T_1), A(T_2))$ be an hhp-bisimulation relating
$A(T_1)$ and $A(T_2)$. Then the set $\{ (d, e, x, y) : (\{d_{xy}\}, \{e_{xy}\}, x, y) \in B \}$ is a
domino bisimulation for $T_1$ and $T_2$.
2. Let $B \subseteq Confs(T_1, T_2)$ be a domino bisimulation relating $T_1$ and $T_2$. Then
the set $\{ (\{d_{xy}\}, \{e_{xy}\}, x, y) : (d, e, x, y) \in B \}$ can be extended to an
hhp-bisimulation for $A(T_1)$ and $A(T_2)$.

This concludes the proof of the main theorem. [Proposition] ■

As a corollary of the above proof we get the following strengthening of our main 
theorem. 

Corollary 14 Hhp-bisimilarity is undecidable for finite labelled 1-safe Petri 
nets. 

Proof sketch (of Corollary 14): An attentive reader might have noticed that
the asynchronous transition system $A(T)$ as described above and sketched
in Figure 1 is not coherent, while all asynchronous transition systems derived
from (1-safe) Petri nets are [?]. It turns out, however, that $A(T)$ is not far
from being coherent: it suffices to close all the diamonds with events $d_{ij}$, and
$x_i$ in positions $(i, j \oplus 1)$, and with events $d_{ij}$, and $y_j$ in positions $(i \oplus 1, j)$, for
$i,j \in \{0,\ldots,3\}$; note that runs ending at the top of these diamonds are maximal
runs. This completion of the transition structure of $A(T)$ does not affect the
arguments used to establish Claim 13 and hence the main theorem, but since it would
obscure the picture in Figure 1(b), we have decided not to draw it there. It is
laborious but routine to construct a 1-safe Petri net whose derived asynchronous
transition system is isomorphic to the completion of $A(T)$ mentioned above.

[Corollary 14] ■

Abstract. This paper examines a number of variants of the sparse
k-spanner problem, and presents hardness results concerning their
approximability. Previously, it was known that most k-spanner problems are
weakly inapproximable, namely, are NP-hard to approximate with ratio
$O(\log n)$, for every $k \ge 2$, and that the unit-length k-spanner problem for
constant stretch requirement $k \ge 5$ is strongly inapproximable, namely,
is NP-hard to approximate with ratio $O(2^{\log^{\epsilon} n})$ [?].

The results of this paper significantly expand the ranges of hardness for
k-spanner problems. In general, strong hardness is shown for a number
of k-spanner problems, for certain ranges of the stretch requirement $k$
depending on the particular variant at hand. The problems studied differ
by the types of edge weights and lengths used, and include also directed,
augmentation and client-server variants of the problem.

The paper also considers k-spanner problems in which the stretch
requirement $k$ is relaxed (e.g., $k = \Omega(\log n)$). For these cases, no
inapproximability results were known at all (even for a constant approximation
ratio) for any spanner problem. Moreover, some versions of the k-spanner
problem are known to enjoy the ratio degradation property, namely, their
complexity decreases exponentially with the inverse of the stretch
requirement. So far, no hardness result existed precluding any k-spanner
problem from enjoying this property. This paper establishes strong
inapproximability results for the case of relaxed stretch requirement (up to
$k = o(n^{\delta})$, for any $0 < \delta < 1$), for a large variety of k-spanner problems.
It is also shown that these problems do not enjoy the ratio degradation
property.



Classification: Approximation algorithms, Hardness of approximation 

1 Introduction 

1.1 The Sparse Spanner Problem 

The concept of graph spanners has been studied in several recent papers, in
the context of communication networks, distributed computing, robotics and
computational geometry [?]. Consider a connected simple graph
$G = (V, E, \omega, l)$ with $|V| = n$ vertices, where $\omega : E \to \mathbb{R}^+$ is a weight function
on the edge set of the graph, and $l : E \to \mathbb{R}^+$ is a length function on the edge
set of the graph. For every pair of vertices $u, v \in V$, let $P(u, v, G)$ be the set of
all simple paths from $u$ to $v$ in $G$. We define the distance between $u$ and $v$ in
$G$ to be $dist(u,v,G) = \min_{P \in P(u,v,G)} \sum_{e \in P} l(e)$. A subgraph $G' = (V, E')$ of
$G$ is a k-spanner if $dist(u,v,G') \le k \cdot dist(u,v,G)$ for every $u, v \in V$. We refer
to $k$ as the stretch factor of $G'$.

Spanners for general graphs were first introduced in [?], and used to
construct a new type of synchronizer for an asynchronous network. For most
applications, it is desirable that the spanner be as sparse or as light as possible,
namely, has few edges or small total weight. This leads to the following problem.
The cost of a subgraph $G'$ is its weight, $\omega(G') = \sum_{e \in E'} \omega(e)$. The goal in the
k-spanner problem is to find a k-spanner $G' = (V, E')$ with the smallest cost
$\omega(G')$.
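The definitions of distance, k-spanner and cost can be checked mechanically on small instances. The following is a minimal sketch, assuming vertices are numbered 0..n-1 and each edge is a tuple (u, v, weight, length); the function names are ours. It runs Dijkstra on edge lengths from every vertex, so it is meant for illustration rather than efficiency.

```python
import heapq


def shortest_lengths(n, edges, source):
    """Dijkstra over edge lengths; edges are (u, v, weight, length) tuples."""
    adj = [[] for _ in range(n)]
    for u, v, _weight, length in edges:
        adj[u].append((v, length))
        adj[v].append((u, length))
    dist = [float("inf")] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, length in adj[u]:
            nd = d + length
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def is_k_spanner(n, edges, sub_edges, k):
    """True iff dist(u, v, G') <= k * dist(u, v, G) for every pair u, v."""
    for u in range(n):
        in_g = shortest_lengths(n, edges, u)
        in_sub = shortest_lengths(n, sub_edges, u)
        # small tolerance guards against floating-point rounding
        if any(in_sub[v] > k * in_g[v] + 1e-12 for v in range(n)):
            return False
    return True


def cost(sub_edges):
    """Cost of a subgraph: the sum of its edge weights."""
    return sum(weight for _u, _v, weight, _length in sub_edges)
```

For a unit triangle, dropping one edge leaves a path that is a 2-spanner of cost 2 but not a 1.5-spanner, since the endpoints of the removed edge are now at distance 2.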

A number of variants of the sparse spanner problem have been considered in
the literature. The general k-spanner problem allows arbitrary edge weights and
lengths. However, the most basic variant of the sparse spanner problem deals
with the simple unweighted uniform case, where $\omega(e) = l(e) = 1$ for every edge
$e \in E$ [24,25]. We call this variant the unweighted (or basic) k-spanner problem.

In-between, one may consider a number of intermediate variants. The first
is the unit-length k-spanner problem studied in [?]. In this case, the weight
function $\omega$ may be arbitrary, but the length function $l$ assigns $l(e) = 1$ to every
edge $e \in E$.

An important special case of the unit-length k-spanner problem is when
the weight function $\omega$ may assign only 0 and 1 values. This problem is called
the light-edges (LE) k-spanner problem, and it is equivalent to the k-spanner
augmentation problem studied in [?]. Intuitively, such a Boolean function
$\omega : E \to \{0, 1\}$ captures the situation where in addition to the target graph $G$,
we are given also an initial partially constructed subnetwork $H'$, whose edges
are assumed to be given in advance for free, and it is required to augment the
subnetwork $H'$ into a k-spanner $H$ for $G$, where edges not in $H'$ must be “paid
for” in order to be included in the spanner. We denote the set of zero-weight
edges by C. The aim is to minimize the number of new edges needed in order to
obtain a k-spanner for the given graph.

A second variant, which can be thought of as the dual of the unit-length
k-spanner problem, is the unit-weight k-spanner problem, studied in [?]. In this
case, the length function $l$ may be arbitrary, but the weight function $\omega$ assigns
$\omega(e) = 1$ to every edge $e \in E$.

Finally, a third variant considered in the literature is the uniform k-spanner
problem [?]. In this case, the weight and length functions coincide, i.e. $\omega(e) = l(e)$
for every edge $e \in E$, but that function $\omega$ may be arbitrary.
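As a quick way to keep the variants apart, the sketch below reports which of the named special cases a given assignment of weights and lengths falls into. It assumes each edge is given as a (weight, length) pair; the function name `variants` and the string labels are ours. An empty result means only the general problem applies.

```python
def variants(edges):
    """Return the named k-spanner variants matched by (weight, length) pairs."""
    edges = list(edges)
    unit_weight = all(w == 1 for w, _l in edges)
    unit_length = all(l == 1 for _w, l in edges)
    names = set()
    if unit_weight and unit_length:
        names.add("unweighted")      # w(e) = l(e) = 1
    if unit_length:
        names.add("unit-length")     # arbitrary weights, l(e) = 1
    if unit_length and all(w in (0, 1) for w, _l in edges):
        names.add("light-edges")     # unit-length with 0/1 weights
    if unit_weight:
        names.add("unit-weight")     # arbitrary lengths, w(e) = 1
    if all(w == l for w, l in edges):
        names.add("uniform")         # weights and lengths coincide
    return names
```

The unweighted case matches every label, which reflects that it is the common special case of all the other variants.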

Any one of the above versions may be also generalized to the client-server
(C-S) k-spanner problem, see [14]. This is a generalization of the k-spanner problem
which distinguishes between the two different roles of edges in the problem, i.e.,
the input specifies also a subset $C$ of client edges which have to be spanned, and
a subset $S$ of server edges which may be used for spanning the client edges. We
also distinguish three subcases of the C-S k-spanner problem. The first is the
disjoint C-S k-spanner (hereafter, DJ k-spanner) problem. In this variant the
client and server sets are disjoint. The second is the all-client C-S k-spanner
(hereafter, AC k-spanner) problem, in which the server set is a subset of the
client set. Finally, the last variant is the all-server C-S k-spanner (hereafter, AS
k-spanner) problem, in which the client set is a subset of the server set.

Following [?], we define the MIN-REP problem as follows. We are given
a bipartite graph $G(V_1, V_2, E)$, where $V_1$ and $V_2$ are each split into a disjoint
union of $r$ sets; $V_1 = \bigcup_{i=1}^{r} A_i$ and $V_2 = \bigcup_{i=1}^{r} B_i$. The sets $A_i$, $B_i$ all have size $N$.
An instance of the problem consists of the 5-tuple $(V_1, V_2, E, \{A_i\}, \{B_i\})$. The
bipartite graph and the partition of $V_1$ and $V_2$ induce a supergraph $\mathcal{H}$, whose
vertices are the sets $A_i$ and $B_j$, where $i,j \in \{1,..,r\}$. Two sets $A_i$ and $B_j$ are
adjacent in $\mathcal{H}$ iff there exists some $a_i \in A_i$ and $b_j \in B_j$ which are adjacent in
$G$. We assume that $\mathcal{H}$ is regular (but its degree, $d = deg(\mathcal{H})$, need not be
$O(1)$). A set of vertices $C$ is a REP-cover for $\mathcal{H}$ if for each super-edge $(A_i, B_j)$
there is a pair $a_i \in A_i$ and $b_j \in B_j$, both belonging to $C$, such that $(a_i, b_j) \in E$.
It is required to select a minimal REP-cover $C$ for $\mathcal{H}$. Note that it is easy to test
whether a MIN-REP instance admits a REP-cover, just by checking whether
the whole vertex set $V_1 \cup V_2$ REP-covers all the superedges. Thus we can assume
without loss of generality, that the given instance admits a REP-cover.
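The REP-cover condition is easy to state in code. The sketch below assumes the partition is given as two dictionaries mapping each vertex to the index of its supernode, and the bipartite edge set as pairs (a, b); all names are ours. It derives the superedges from the edge set and checks that every superedge has a witnessing edge with both endpoints in the candidate set.

```python
def super_edges(E, part_a, part_b):
    """Superedges (i, j) of the supergraph induced by the bipartite edges E."""
    return {(part_a[a], part_b[b]) for a, b in E}


def is_rep_cover(C, E, part_a, part_b):
    """True iff every superedge has an edge (a, b) in E with a and b both in C."""
    covered = {(part_a[a], part_b[b]) for a, b in E if a in C and b in C}
    return covered == super_edges(E, part_a, part_b)
```

MIN-REP then asks for a set C of minimum size for which `is_rep_cover` holds; the check itself is only the feasibility test, not an approximation algorithm.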

Consider also the maximization version of this problem, called MAX-REP. 
In this version the REP-cover C may contain at most one vertex from each 
supernode. 

Another close problem is the Label-Cover problem [?]. In this problem a
superedge $(A_i, B_j)$ is covered if for every vertex $a_i \in A_i \cap C$ there is a vertex
$b_j \in B_j \cap C$ such that $(a_i, b_j) \in E$. It also has the minimization and maximization
versions called $\text{Label-Cover}_{MIN}$ and $\text{Label-Cover}_{MAX}$ respectively. Note that
the $\text{Label-Cover}_{MAX}$ problem is equivalent to the MAX-REP problem.

Consider also the Symmetric Label Cover problem [12,10]. It is the MIN-REP
problem restricted to a complete bipartite supergraph.



1.2 Previous Results 

It is shown in [?] that the problem of determining, for a given unweighted graph
$G = (V, E)$ and an integer $m$, whether there exists a 2-spanner with $m$ or fewer
edges is NP-complete. This indicates that it is unlikely to find an exact solution
for the sparsest k-spanner problem even in the case $k = 2$. Consequently, two
possible remaining courses of action for investigating the problem are establishing
global bounds on the number of edges required for an unweighted k-spanner
of various graph classes and devising approximation algorithms for the problem.

In [24] it is shown that every unweighted n-vertex graph $G$ has a polynomial
time constructible $(4k + 1)$-spanner with at most $O(n^{1+1/k})$ edges. Hence in
particular, every graph $G$ has an $O(\log n)$-spanner with $O(n)$ edges. These
results are close to the best possible in general, as implied by the lower bound
given in [?].

The results of [?] were improved and generalized in [?] to the uniform
case, in which the edge weights and lengths coincide. Specifically, it is shown in
[?] that given an n-vertex graph and an integer $k \ge 1$, there is a polynomially
constructible $(2k + 1)$-spanner $G'$ such that $|E(G')| < n \cdot \lceil n^{1/k} \rceil$. Again, this
result is shown to be asymptotically the best possible. In [?] it was shown that the
weight of the uniform k-spanner obtained by the construction of [?] is bounded by
$\omega(G') = O(n^{\frac{2+\epsilon}{k-1}} \cdot \omega(MST))$. They also show how the construction can be used to
provide uniform $\log^2 n$-spanners with weight bounded by $\omega(G') = O(\omega(MST))$.

The algorithms of [?] provide us with global upper bounds for sparse
k-spanners, i.e., general bounds that hold for every graph. However, it may be
that for specific graphs, considerably sparser spanners exist. Furthermore, the 
upper bounds on sparsity given by these algorithms are small (i.e., close to n) 
only for large values of k. It is therefore interesting to look for approximation 
algorithms, that yield near-optimal bounds for the specific graph at hand. 

In [?], a $\log\frac{|E|}{|V|}$-approximation algorithm was presented for the unweighted
2-spanner problem. In [?] the result was extended to an $O(\log n)$-approximation
algorithm for the unit-length 2-spanner problem. A logarithmic-ratio approximation
algorithm for the unit-length C-S 2-spanner problem was presented in [?].
Approximation algorithms with logarithmic ratio are given also in [14] for a number
of other variants of the problem, such as the unit-length 2-spanner augmentation
problem and directed unit-length 2-spanner (augmentation) problems.

Also, since any /c-spanner for an n-vertex graph requires at least n — 1 edges, 
the results of mm cited above can be interpreted as providing an 0(n^/^)-ratio 
approximation algorithm for the (unweighted or weighted) uniform /c-spanner 
problem. This implies that once the required stretch guarantee is relaxed, i.e.,
$k$ is allowed to be large, the problem becomes easier to approximate. In
particular, the unweighted k-spanner problem admits $O(1)$ approximation once the
stretch requirement becomes $k = \Omega(\log n)$, and the uniform k-spanner problem
admits an $O(1)$ approximation ratio once the stretch requirement becomes
$k = \Omega(\log^2 n)$. We call this property ratio degradation.
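A small numeric illustration of ratio degradation, under the assumption that the guaranteed ratio has the form $n^{c/k}$ for some constant $c$ (the exact exponent is an assumption of this sketch, not a value taken from the text): as soon as $k$ reaches $\log_2 n$ the bound collapses to the constant $2^c$. The function names are ours.

```python
import math


def ratio_bound(n, k, c=1.0):
    """Illustrative ratio n^(c/k); c is an assumed constant, not from the paper."""
    return n ** (c / k)


def stretch_for_constant_ratio(n):
    """Stretch k = log2(n), at which n^(1/k) equals exactly 2."""
    return math.log2(n)
```

With n = 2^20, the bound is 1024 at k = 2 but only 2 at k = 20, which is the exponential decrease in 1/k that the ratio degradation property refers to.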

Previously known hardness results for spanner problems were of two types.
First, it is shown in [?] that it is NP-hard to approximate the basic unweighted
k-spanner problem by an $O(\log n)$ ratio for $k \ge 2$. This type of $O(\log n)$
inapproximability is henceforth referred to as weak inapproximability. This result applies
to the majority of the problems studied in this paper. Secondly, it is shown in [?]
that the unit-length k-spanner problem for constant stretch requirement $k \ge 5$
is hard to approximate with $O(2^{\log^{\epsilon} n})$ ratio, unless $NP \subseteq DTIME(n^{polylog\, n})$
(or, in short, that approximating it with this ratio is quasi-NP-hard). This type
of $\Omega(2^{\log^{\epsilon} n})$ inapproximability is henceforth referred to as strong
inapproximability. This result was recently extended to $k \ge 3$ in [12] and independently by
us, in the current paper, using a different reduction. It is shown in [12] that the
MIN-REP problem is strongly inapproximable. As recently shown in [?], for
every $0 < \epsilon < 1$ it is NP-hard to approximate the $\text{Label-Cover}_{MIN}$ problem
with $O(2^{\log^{\epsilon} n})$ approximation ratio. As discussed in [?] this result applies to
the Symmetric Label Cover problem and thus to the MIN-REP problem.

1.3 Summary of Our Results 

Our results significantly expand the ranges of hardness for spanner problems. 
Our main results can be classified as follows (see also Table 1).



Type of k-spanner   Range of strong hardness   Previously known                    Ratio degradation
problem             proven in the paper        hardness results                    property
------------------  -------------------------  ----------------------------------  -----------------
Uniform             1 < k ≤ 3                  Weak hardness for k ≥ 2 [?]         YES
Unit-weight         1 < k < 3                  Weak hardness for k ≥ 2 [?]         YES
Directed            3 ≤ k < o(n^δ)             Weak hardness for k ≥ 2             NO
DJ                  3 ≤ k < o(n^δ)             No previous results                 NO
AC                  3 ≤ k < o(n^δ)             Weak hardness for k ≥ 2             NO
C-S                 3 ≤ k < o(n^δ)             Weak hardness for k ≥ 2             NO
Augmentation        4 ≤ k < o(n^δ)             Weak hardness for k ≥ 2 [?]         NO
Unit-weight DJ      0 < k < ∞                  No previous results                 NO
Unit-weight AS      0 < k < 3                  No previous results                 YES
Unit-length         3 ≤ k < o(n^δ)             Strong hardness for 5 ≤ k = O(1);   NO
                                               extended to 3 ≤ k = O(1) by [12],
                                               independently from us


Table 1. The Summary of Results. 



To begin with, we obtain strong inapproximability results for a number of
variants of the k-spanner problem. In particular, we prove the first strong
inapproximability results for uniform k-spanner problems, for the range of stretch
requirement $1 < k \le 3$. Specifically, we prove that

Theorem 1. For any $0 < \epsilon < 1$ and $1 < k < 3 + \ldots$ it is quasi-NP-hard to
approximate the uniform k-spanner problem with ratio $2^{\log^{\epsilon} n}$.

The uniform k-spanner problem was intensively studied during the last decade
[?], but as mentioned above, the only previous hardness results were for
weak inapproximability for the range of $k \ge 2$ [?].

Also, we obtain a strong inapproximability result for the unit-weight
k-spanner problem, for the range of stretch requirement $1 < k < 3$. Specifically,

Theorem 2. For any $\epsilon > 0$ and stretch requirement $1 < k < 3$, it is
quasi-NP-hard to approximate the unit-weight k-spanner problem with ratio $2^{\log^{\epsilon} n}$.

This problem was also studied in [?], but again the only previous hardness
results were for weak inapproximability, for the range of $k \ge 2$. We note that
for the range of the stretch requirement $1 < k < 2$, no hardness result at all was
known for the above two problems (even for a constant ratio).

Moreover, we obtain strong inapproximability results for certain versions of
the unweighted k-spanner problem as well, for the range of stretch requirement
$k > 2$. In particular, we obtain strong inapproximability results for the directed
unweighted k-spanner problem, for the DJ unweighted k-spanner problem and for
the AC unweighted k-spanner problem, and thus for the C-S k-spanner problem
in general. Specifically, we have

Theorem 3. For any $\epsilon > 0$ and constant integer $k \ge 3$, it is quasi-NP-hard
to approximate the unweighted directed k-spanner, unweighted DJ k-spanner or
unweighted AC k-spanner problems with ratio $2^{\log^{\epsilon} n}$.

Note that for the DJ unweighted k-spanner problem no hardness result was
known at all, since no reduction from the unweighted k-spanner problem to the
DJ unweighted k-spanner problem is known. For the AC unweighted k-spanner
problem, the only hardness result known is weak inapproximability for $k \ge 2$ [?].
For $k = 2$ a logarithmic-ratio approximation algorithm for both problems is
provided in [?].

The directed unweighted k-spanner problem was presented already in the
first paper defining the notion of k-spanner [24], but the only hardness result
known for the problem is weak inapproximability for $k \ge 2$ [?]. We significantly
improve the threshold, showing strong inapproximability for $k > 2$ (which is
the best possible, since for $k = 2$ the problem enjoys a logarithmic-ratio
approximation algorithm [?]).

We also obtain strong inapproximability results for the k-spanner
augmentation problem for the range of stretch requirement $k > 3$.



Theorem 4. For any $\epsilon > 0$ and constant integer $k \ge 4$, it is quasi-NP-hard to
approximate the unit-length k-spanner augmentation problem with ratio $2^{\log^{\epsilon} n}$.

The only previously known lower bound for the approximability of the problem
was $\Omega(\log n)$ for $k \ge 2$ (see [?]).

Our second contribution involves hardness results for relaxed stretch
requirements and ratio degradation. All the previous hardness results [?] have shown
hardness for k-spanner problems only for certain constant values of the stretch
requirement $k$. For relaxed stretch requirement $k = \Omega(\log n)$ no hardness
results (even for a constant approximation ratio) were known for any spanner
problem. Furthermore, the uniform k-spanner problem and the unweighted
k-spanner problem are known [?] to enjoy the ratio degradation property,
i.e., their complexity decreases exponentially with the inverse of the stretch
requirement (and, in particular, admit $O(\log n)$ and $O(1)$ approximation ratios,
respectively, whenever the stretch requirement becomes $\Omega(\log n)$). No hardness
result existed precluding some k-spanner problem from enjoying this property.

In this paper we establish the first strong inapproximability results for the
relaxed stretch requirement $k = o(n^{\delta})$, for any $0 < \delta < 1$, for a number of spanner
problems. Specifically we have

Theorem 5. For any $0 < \epsilon, \delta < 1$ and $3 \le k = o(n^{\delta})$ it is quasi-NP-hard
to approximate the unit-length k-spanner, unweighted DJ k-spanner, unweighted
AC k-spanner or unweighted directed k-spanner problems with ratio $2^{\log^{\epsilon} n}$.
Moreover, for any $0 < \epsilon, \delta < 1$ and $4 \le k = o(n^{\delta})$ it is quasi-NP-hard to approximate
the unit-length k-spanner augmentation problem with ratio $2^{\log^{\epsilon} n}$.

Also we show that all these problems do not enjoy the ratio degradation
property. These problems include the directed unweighted k-spanner problem, the DJ
unweighted k-spanner problem (and thus the C-S unweighted k-spanner problem),
the AC unweighted k-spanner problem, the unit-length k-spanner problem and
the k-spanner augmentation problem. Specifically,

Theorem 6. For any a, /? > 0 and 0 < ei, C 2 < 1 there is no algorithm A{G, k) 
that approximates the unit-length k-spanner or unit-length k-spanner augmenta- 
tion problems on every n-vertex graph G and for every sufficiently large k with 
ratio 0{k°‘ ■ w*F^) or 0(2*°s ^ ^ ■ w* *= ). 



Theorem 7. For any o,/3 > 0 and 0 < ei,C 2 < 1, and for any problem 77 
from among the unit-length k-spanner, unit-length k-spanner augmentation, DJ 
k-spanner, AC k-spanner, C-S k-spanner or directed k-spanner problems, there 
is no algorithm A(G, k) that approximates 77 on every n-vertex graph G and for 
every sufficiently large k with ratio 0[k°‘ ■ n'^) or 0(2^°® ^ 

For weighted k-spanner problems, another potential direction one may
consider is to look for an approximation algorithm whose ratio guarantee is some
function of the optimal weight $w^*$ (and not only the number of vertices $n$). We
show that the unit-length k-spanner problem, k-spanner augmentation problem
and uniform k-spanner problem can be approximated neither with ratio
$2^{\log^{\epsilon} n}$ nor with ratio $2^{\log^{\epsilon} w^*}$ (within the corresponding ranges), unless
$NP \subseteq DTIME(n^{polylog\, n})$.

Theorem 8. For any $0 < \epsilon, \delta < 1$ and $3 \le k = o(n^{\delta})$ it is quasi-NP-hard to
approximate the unit-length k-spanner problem with ratio $2^{\log^{\epsilon} n}$ or $2^{\log^{\epsilon} w^*}$.

Theorem 9. For any $0 < \epsilon < 1$ and $1 < k \le 3$ it is quasi-NP-hard to
approximate the uniform k-spanner problem with ratio $2^{\log^{\epsilon} n}$ or $2^{\log^{\epsilon} w^*}$.

We obtain also several secondary results. First, we show a strong
inapproximability result for the Red-Blue problem. This result was obtained
independently by [?]. Also we present several reductions from some k-spanner problems
to the Red-Blue problem. Specifically, we prove that

Theorem 10. For any $0 < \epsilon, \delta < 1$, the RB problem is quasi-NP-hard to
approximate with ratio $2^{\log^{\epsilon} n}$ even when restricted to exactly one blue element and
two red elements in each set, $|S|$ and $|R|$ are both bounded by … and $|B|$ is
bounded by $n$.

Theorem 11. The general k-spanner problem such that the ratio between the 
maxeg£:{^(e)} and the mineg£:{^(e)} is bounded by a constant is reducible to the 
RB problem. 

Finally, in addition to our strong inapproximability result concerning the
unit-weight k-spanner problem we obtain strong inapproximability results for
the widest possible range of the stretch requirement $0 < k < \infty$ for the
unit-weight DJ k-spanner problem.

Theorem 12. For any $0 < k < \infty$ and $0 < \epsilon < 1$ it is quasi-NP-hard to
approximate the DJ unit-weight k-spanner problem with ratio $2^{\log^{\epsilon} n}$, even when
restricted to two possible edge lengths.

Also we obtain for the unit-weight AS k-spanner problem a strong
inapproximability result for a wider range than for the unit-weight k-spanner problem,
specifically, for $0 < k < 3$.

Theorem 13. For any $0 < k < 3$ and $0 < \epsilon < 1$, it is quasi-NP-hard to
approximate the unit-weight AS k-spanner problem with ratio $2^{\log^{\epsilon} n}$, even when
restricted to two possible edge lengths.

All of our results are established by reductions from the MIN-REP problem
of [?]. Specifically, we discuss the relationship between the MIN-REP problem
and the Label-Cover problem (cf. [?]). Moreover, we show that the MIN-REP
problem and the Label-Cover problem admit a $\sqrt{n}$-approximation ratio and that
the MIN-REP and the Label-Cover problems restricted to the cases where the
girth of the induced supergraph is greater than $t$, admit an $n^{2/t}$ approximation
ratio. In particular, it follows that the MIN-REP and the Label-Cover problems
with girth greater than $\log^{\epsilon} n$ (for some constant $\epsilon > 0$) are not strongly
inapproximable, i.e., admit an $O(2^{\log^{\epsilon'} n})$-approximation ratio, for some $0 < \epsilon' < 1$.
Specifically, we prove

Theorem 14.
1. MIN-REP and $\text{Label-Cover}_{MIN}$ with $girth(\mathcal{H}) > t$ admit
an $O(n^{2/t})$-approximation algorithm.
2. MIN-REP and $\text{Label-Cover}_{MIN}$ with $girth(\mathcal{H}) = O(\log n)$ admit an $O(1)$
approximation ratio.
3. For any $0 < \epsilon < 1$ there exists $0 < \epsilon' < 1$ such that MIN-REP and
$\text{Label-Cover}_{MIN}$ with $girth(\mathcal{H}) = O(\log^{\epsilon} n)$ admit an $O(2^{\log^{\epsilon'} n})$
approximation algorithm.

Corollary 1. The MIN-REP and $\text{Label-Cover}_{MIN}$ problems admit a
$\sqrt{n}$-approximation algorithm.

Very recently we resolved the main question left open in this paper,
concerning the inapproximability of the basic k-spanner problem. Specifically, we have
proved that the basic k-spanner problem is strongly inapproximable for any $k \ge 3$
[?]. In particular, this and the results presented here imply that the uniform
k-spanner problem and the unit-length k-spanner problem are strongly
inapproximable for any constant value of the stretch requirement $k > 1$ and that the
k-spanner augmentation problem is strongly inapproximable for $3 \le k = o(n^{\delta})$.
We have also shown there that MIN-REP and the $\text{Label-Cover}_{MIN}$ problems
with girth greater than $\log^{\mu} n$ (for some constant $0 < \mu < 1$) are inapproximable
within a ratio of $\Omega(2^{\log^{\epsilon} n})$, for any $0 < \epsilon < 1 - \mu$.

In the remainder of the paper, we illustrate our proof techniques by
establishing one part of Theorem 1. The remaining proofs are deferred to the full paper.
One can also find most of them in the technical report version of the paper [?].

2 The Uniform Spanner Problem 

Our proof of Theorem 1 is comprised of four parts. In the first part we prove
the result for the stretch requirement $1 < k < 2$, in the second we extend it to
$1 < k < 3$, in the third we prove it for $k = 3$ and finally in the fourth part we
slightly improve the result, by proving it for $k = 3 + \ldots$ In this section we
sketch the proof of the first part of the theorem, i.e., for $k = 1 + \epsilon$, $0 < \epsilon < 1$.
This is done by a reduction from the MIN-REP problem.

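The reduction is as follows. Fix $0 < \epsilon < 1$ and let

$$x = \max\{\epsilon^{-1},\ |E| + 2rN^2 + 1\}.$$

Given an instance of the MIN-REP problem $(V_1, V_2, E, \{A_i\}, \{B_i\})$, we construct an
instance $\bar{G} = (\bar{V}, \bar{E})$ of the general $(1 + \epsilon)$-spanner problem as follows.

$$\bar{V} = \bigcup_{i=1}^{r} A_i \cup \bigcup_{i=1}^{r} B_i \cup \{s_i, t_i\}_{i=1}^{r},$$

$$\bar{E} = E \cup D \cup E_{sA} \cup E_{tB} \cup E_{\mathcal{H}},$$

with

$$D = \bigcup_{i=1}^{r} \{(a_i^{l}, a_i^{l'}) \mid a_i^{l}, a_i^{l'} \in A_i\} \cup \bigcup_{j=1}^{r} \{(b_j^{l}, b_j^{l'}) \mid b_j^{l}, b_j^{l'} \in B_j\},$$

$$E_{sA} = \bigcup_{i=1}^{r} \{(s_i, a_i^{l}) \mid a_i^{l} \in A_i\},$$

$$E_{tB} = \bigcup_{i=1}^{r} \{(b_i^{l}, t_i) \mid b_i^{l} \in B_i\},$$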

$$E_{\mathcal{H}} = \{(s_i, t_j) \mid (A_i, B_j) \in \mathcal{H}\}.$$

Fig. 1. Dotted lines represent unit weight edges, solid lines represent edges of
weight $x$, dashed lines represent edges of weight $(2x + 1)/(1 + \epsilon)$.

The weight assignment on the edges is as follows (see Figure 1).

$$\omega(e) = \begin{cases} 1, & e \in E \cup D \\ x, & e \in E_{sA} \cup E_{tB} \\ (2x + 1)/(1 + \epsilon), & e \in E_{\mathcal{H}} \end{cases}$$
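The construction can be sketched as a function that takes a MIN-REP instance and returns the weighted edge set of the spanner instance. This is an illustration under stated assumptions: supernodes are lists of string vertex names, the new vertices are represented as the tuples ("s", i) and ("t", j), and the default value of x follows our reading of the displayed formula (in particular the term 2rN^2 is an assumption); x can also be passed explicitly. In the uniform problem the edge lengths equal these weights.

```python
from itertools import combinations


def build_spanner_instance(A, B, E, eps, x=None):
    """Return (weight, x): a dict mapping each edge of the new graph to its weight.

    A, B: lists of supernodes (lists of distinct string vertex names).
    E:    set of bipartite edges (a, b).
    x:    weight of the s_i / t_j attachment edges; the default is an assumed
          reading of the formula max(1/eps, |E| + 2 r N^2 + 1).
    """
    r, N = len(A), len(A[0])
    if x is None:
        x = max(1.0 / eps, len(E) + 2 * r * N * N + 1)
    sup_a = {a: i for i, part in enumerate(A) for a in part}
    sup_b = {b: j for j, part in enumerate(B) for b in part}
    weight = {}
    for a, b in E:                       # original edges: unit weight
        weight[(a, b)] = 1.0
    for part in A + B:                   # D: cliques inside each supernode
        for u, v in combinations(part, 2):
            weight[(u, v)] = 1.0
    for i, part in enumerate(A):         # E_sA: s_i to every vertex of A_i
        for a in part:
            weight[(("s", i), a)] = x
    for j, part in enumerate(B):         # E_tB: every vertex of B_j to t_j
        for b in part:
            weight[(b, ("t", j))] = x
    heavy = (2 * x + 1) / (1 + eps)
    for a, b in E:                       # E_H: one heavy edge per superedge
        weight[(("s", sup_a[a]), ("t", sup_b[b]))] = heavy
    return weight, x
```

The three weight classes are visible directly in the dictionary: unit edges are free in comparison to x, and the heavy edges cost roughly twice x, so a path s_i, a, b, t_j of weight 2x + 1 is what a (1 + ε)-spanner is expected to use instead.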

Let us briefly provide some intuition for the way the reduction operates. The
vertices $s_i$ and $t_j$ represent the supernodes $A_i$ and $B_j$, respectively. The edges
of $E_{\mathcal{H}}$, between $s_i$ and $t_j$, correspond exactly to the edges of the supergraph
$\mathcal{H}$. We assign these edges the biggest weight in the construction (specifically,
$(2x + 1)/(1 + \epsilon)$) in order to prevent the optimal spanner from using them. On
the other hand, the spanner will use intensively the edges of $E$, from the original
graph. In the MIN-REP problem we pay only for the vertices of the original
graph which are taken into the REP-cover. For this reason, we assign the edges
of $E$ the lowest weight in the construction (the unit weight). Finally, taking any
edge of $E_{sA}$, connecting $s_i$ and some vertex of $A_i$ (or any edge of $E_{tB}$, connecting
$t_j$ and some vertex of $B_j$) into the spanner represents taking this vertex from $A_i$
(or $B_j$) into the REP-cover. Since we are interested in minimizing the number of
vertices taken into the REP-cover, we assign these edges a weight that is much
bigger than the weight of edges of $E$, for which we are interested not to pay at all,
but significantly smaller than the weight of edges of $E_{\mathcal{H}}$, which we try to make
too “expensive” for the spanner to use. Hence, the weight of any near-optimal
spanner approximately equals the number of $E_{sA}$, $E_{tB}$ edges it uses, multiplied
by $x$.

Given the spanner $H$ for the instance $\bar{G}$ of weight $\omega(H)$, we construct a
MIN-REP cover $C$ of size approximately $\omega(H)/x$ in two stages. First, for every
spanner $H$ we construct a spanner $H'$ of approximately the same size that does
not use $E_{\mathcal{H}}$ edges. We call such a spanner a proper spanner. Next, from $H'$
we build a MIN-REP cover of size approximately $\omega(H')/x$.

Lemma 1. For every $(1 + \epsilon)$-spanner $H$, there is a polynomial time constructible
proper $(1 + \epsilon)$-spanner $H'$ such that $\omega(H') < (1 + \epsilon) \cdot \omega(H)$.

Given a proper spanner $H'$ we construct a REP-cover $C$ of size close to
$\omega(H')/x$ by letting

$$C = \{a_i^{l}, b_j^{l'} \mid (s_i, a_i^{l}), (b_j^{l'}, t_j) \in H'\}. \qquad (1)$$

Lemma 2. $C$ defined by (1) is a REP-cover for $G$, and $|C| < (1 + \epsilon)\,\omega(H)/x$.

Conversely, given a REP-cover $C$ we construct a $(1 + \epsilon)$-spanner $H$ for $\bar{G}$ by
letting

$$H = E \cup D \cup \{(s_i, a_i^{l}), (b_j^{l'}, t_j) \in \bar{E} \mid a_i^{l}, b_j^{l'} \in C\}.$$

Lemma 3. $H$ is a $(1 + \epsilon)$-spanner for $\bar{G}$, and $\omega(H) < 2x \cdot |C|$.
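The two directions of the correspondence between REP-covers and proper spanners can be sketched as follows. The representation is an assumption of this sketch: original vertices are strings, and the added vertices are the tuples ("s", i) and ("t", j); the function names are ours. The code only builds the edge sets; it does not verify the stretch bound claimed in the lemmas.

```python
from itertools import combinations


def is_terminal(v):
    """True for the added vertices ("s", i) or ("t", j)."""
    return isinstance(v, tuple) and v[0] in ("s", "t")


def spanner_from_cover(A, B, E, C):
    """Edges E and D, plus the attachment edges of the vertices chosen in C."""
    H = set(E)
    for part in A + B:
        H.update(combinations(part, 2))          # clique edges D
    for i, part in enumerate(A):
        H.update((("s", i), a) for a in part if a in C)
    for j, part in enumerate(B):
        H.update((b, ("t", j)) for b in part if b in C)
    return H


def cover_from_proper_spanner(H):
    """Vertices attached to some s_i or t_j by an edge of the proper spanner H."""
    C = set()
    for u, v in H:
        if is_terminal(u) and not is_terminal(v):
            C.add(v)
        elif is_terminal(v) and not is_terminal(u):
            C.add(u)
    return C
```

Each vertex of the cover contributes exactly one edge of weight x, so the spanner built this way weighs x times |C| plus the unit edges, which is where the factor of about x between the two objective values comes from.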

This establishes Theorem 1 for $k = 1 + \epsilon$, where $0 < \epsilon < 1$.

Abstract. The traveling salesman problem (TSP) is one of the hardest
optimization problems in NPO because it does not admit any polynomial
time approximation algorithm (unless P = NP). On the other hand
we have a polynomial time approximation scheme (PTAS) for the
Euclidean TSP and the $\frac{3}{2}$-approximation algorithm of Christofides for TSP
instances satisfying the triangle inequality. The main contributions of
this paper are the following:

(i) We essentially modify the method of Engebretsen in order to
get a lower bound of $\frac{3813}{3812} - \epsilon$ on the polynomial-time approximability
of the metric TSP for any $\epsilon > 0$. This is an improvement over the
lower bound of $\frac{5381}{5380} - \epsilon$ in [En99]. Using this approach we moreover
prove a lower bound $\delta_\beta$ on the approximability of $\Delta_\beta$-TSP for
$\frac{1}{2} < \beta < 1$, where $\Delta_\beta$-TSP is a subproblem of the TSP whose input
instances satisfy the $\beta$-sharpened triangle inequality
$cost(\{u, v\}) \le \beta \cdot (cost(\{u, x\}) + cost(\{x, v\}))$ for all vertices $u, v, x$.

(ii) We present three different methods for the design of polynomial-time
approximation algorithms for $\Delta_\beta$-TSP with $\frac{1}{2} < \beta < 1$, where the
approximation ratio lies between 1 and $\frac{3}{2}$, depending on $\beta$.



Keywords: Approximation algorithms. Traveling Salesman Problem 

1 Introduction 

The traveling salesman problem (TSP) is one of the hardest optimization problems
in NPO. It is considered to be intractable¹ because it does not admit any
polynomial time $p(n)$-approximation algorithm for any polynomial $p$ in the input
size $n$. On the other hand there are large subclasses of input instances of the
TSP that admit polynomial-time approximation algorithms with a reasonable
approximation ratio. The Euclidean TSP (also called geometric TSP) admits
even a polynomial time approximation scheme and the $\Delta$-TSP (TSP with triangle
inequality, also called metric TSP) can be approximated by the Christofides
algorithm with an approximation ratio of $\frac{3}{2}$.² Generally, recent research has
shown that the “relation” of an input instance of the TSP to the triangle
inequality may be essential for estimating the hardness of this particular input
instance. We say, for every $\beta \ge \frac{1}{2}$, that an input instance of the general TSP
satisfies the $\beta$-triangle inequality if

$$\mathrm{cost}(\{u, v\}) \le \beta \cdot (\mathrm{cost}(\{u, x\}) + \mathrm{cost}(\{x, v\}))$$

for all vertices $u, v, x$. By $\Delta_\beta$-TSP we denote the TSP whose input instances
satisfy the $\beta$-triangle inequality. If $\beta > 1$ then we speak about relaxed triangle
inequality and if $\beta < 1$ we speak about sharpened triangle inequality.

¹ This holds even if one considers the current view of tractability exchanging exact
solutions for their approximations and determinism for randomization.
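To make the definition concrete, the following Python sketch checks the $\beta$-triangle inequality on a small symmetric cost matrix and computes the smallest $\beta$ that an instance satisfies. The function names `satisfies_beta_triangle` and `smallest_beta` and the matrix representation are illustrative assumptions, not notation from the paper.

```python
from itertools import permutations


def satisfies_beta_triangle(cost, beta, tol=1e-12):
    """Check cost[u][v] <= beta * (cost[u][x] + cost[x][v]) for all distinct u, v, x.

    `cost` is assumed to be a symmetric matrix (list of lists) holding the edge
    costs of a complete graph; `tol` absorbs floating-point rounding.
    """
    n = len(cost)
    return all(
        cost[u][v] <= beta * (cost[u][x] + cost[x][v]) + tol
        for u, v, x in permutations(range(n), 3)
    )


def smallest_beta(cost):
    """Return the smallest beta for which the instance satisfies the inequality."""
    n = len(cost)  # needs n >= 3 so that at least one triangle exists
    return max(
        cost[u][v] / (cost[u][x] + cost[x][v])
        for u, v, x in permutations(range(n), 3)
    )


# All edges equal: the sharpest possible case.
equal = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
# A degenerate metric triangle with costs 1, 1, 2: no sharpening is possible.
metric = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

print(smallest_beta(equal), smallest_beta(metric))
```

The first instance is the extreme case $\beta = \frac{1}{2}$ in which all edges have the same cost; the second sits exactly on the ordinary triangle inequality.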

Considering the relaxed triangle inequality, in [AB95, BC99, BHKSU00] it
has been proved that

(i) $\Delta_\beta$-TSP can be approximated in polynomial time with approximation ratio
$\min\{4\beta, \frac{3}{2}\beta^2\}$, and

(ii) unless P = NP, $\Delta_\beta$-TSP cannot be approximated with approximation ratio
$1 + \epsilon \cdot \beta$ for some $\epsilon > 0$.

Thus, these results enable us to partition all input instances of TSP into infinitely 
many classes according to their approximability, and the membership to a class 
can be efficiently decided. 

In this paper we consider for the first time the sharpened triangle inequality,
i.e. with $\beta < 1$. This does not seem to be as natural as to consider the $\Delta_\beta$-TSP
for $\beta \ge 1$, but there are some good reasons to do so. In what follows we list the
two main reasons.

1. We have PTASs for the geometrical TSP [Ar96, Ar97, Mi96], but none
of them is practical because to achieve an approximation ratio $1 + \epsilon$ one
needs $O(n^{30 \cdot \frac{1}{\epsilon}})$ time (in the randomized case $O(n \cdot (\log_2 n)^{30 \cdot \frac{1}{\epsilon}})$). So,
from the user's point of view the best algorithm for the geometrical TSP
is the Christofides algorithm for $\Delta$-TSP with its approximation ratio of $\frac{3}{2}$.
Thus, it is of interest to search for nontrivial subclasses of the input instances
of $\Delta$-TSP (or even of the geometrical TSP) for which a better approximation
ratio than $\frac{3}{2}$ is achievable. The sharpened triangle inequality has a nice
geometrical interpretation: The direct connection must be shorter than any
connection via a third vertex. So, the problem instances of $\Delta_\beta$-TSP for $\beta < 1$
have the property that no vertex (point of the plane) lies on or almost on
the direct connection between another two vertices (points).

2. The problem $\Delta_{1/2}$-TSP is simple because all edges must have the same cost.
How hard is then the problem for values of $\beta$ that are very close to (but
different from) $\frac{1}{2}$? Are these problems NP-hard or even APX-hard? If the
answers were positive, one could consider partitioning the set of input
instances of $\Delta$-TSP into an infinite spectrum of $\Delta_\beta$-TSP instances for
$\beta \in (\frac{1}{2}, 1]$ according to the polynomial-time achievable approximation ratio. This
could provide a similar picture as partitioning the input instances of the
general TSP into classes according to the relaxed triangle inequality in
[AB95, BC99, BHKSU00].

² In what follows an $\alpha$-approximation algorithm for a minimization problem is any
algorithm that produces feasible solutions whose costs divided by the cost of optimal
solutions is at most $\alpha$.



Our first and main result is the improvement of the explicit lower bound on
the polynomial-time approximability from $\frac{5381}{5380} - \epsilon$ [En99] to $\frac{3813}{3812} - \epsilon$ for any
$\epsilon > 0$. Our proof is based on the idea of Engebretsen, who reduced the LinEq2-2(3)
problem to the TSP subproblem with edge costs 1 and 2 only. We modify this
proof technique by considering the reduction to input instances of $\Delta$-TSP whose
edge costs are from $\{1, 2, 3\}$. This modification requires some crucial changes
in the construction of Engebretsen as well as some essentially new technical
considerations. We apply the obtained lower bound on $\Delta$-TSP to get explicit
lower bounds on the approximability of $\Delta_\beta$-TSP for every $\beta$, $\frac{1}{2} < \beta < 1$. So,
the answer to our motivation (2) for the study of $\Delta_\beta$-TSP problems for $\beta < 1$ is
that these problems are APX-hard for every $\beta > \frac{1}{2}$.

In the second part of our paper, we present three different approaches for
investigating $\Delta_\beta$-TSP with $\beta < 1$. First, we analyze the behavior of the Christofides
algorithm and prove that it is a $\left(1 + \frac{2\beta - 1}{3\beta^2 - 2\beta + 1}\right)$-approximation algorithm
for $\Delta_\beta$-TSP with $\frac{1}{2} \le \beta < 1$. Secondly we present a general idea how to modify
every $\alpha$-approximation algorithm for $\Delta$-TSP to achieve an
$\left(\frac{\alpha \cdot \beta^2}{\beta^2 + (\alpha - 1) \cdot (1 - \beta)^2}\right)$-approximation algorithm for $\Delta_\beta$-TSP. The last approach designs a new special
$\left(\frac{2}{3} + \frac{1}{3} \cdot \frac{\beta}{1 - \beta}\right)$-approximation algorithm for $\Delta_\beta$-TSP. This algorithm is the best
one for $\frac{1}{2} \le \beta \le \frac{2}{3}$, the first two approaches dominate for $\beta > \frac{2}{3}$. In this way
together with the lower bounds we achieve our aim to partition the set of input
instances of $\Delta$-TSP into an infinite spectrum according to the polynomial-time
approximability. Moreover, our algorithms are efficient (the first runs in time
$O(n^{2.5} (\log n)^{1.5})$), and so we have practical solutions for large subclasses
of $\Delta$-TSP with an approximation ratio better than $\frac{3}{2}$.

This paper is organized as follows. In Section 2 we present our improved lower
bound on the polynomial-time approximability of $\Delta$-TSP. Section 3 contains
some elementary fundamental observations about $\Delta_\beta$-TSP that are useful for all
three approaches and the claim that $\Delta_\beta$-TSP is APX-complete for every $\beta > \frac{1}{2}$.
The subsequent sections present the three approaches in the above mentioned
order.

2 An Improved Lower Bound for the TSP with Triangle
Inequality

In this section we will present an improved lower bound on the approximation
ratio of $\Delta$-TSP. We will show that it is NP-hard to approximate $\Delta$-TSP within
$\frac{3813}{3812} - \epsilon$ for any $\epsilon > 0$. The best previously known lower bound is that of
Engebretsen [En99] which states that it is NP-hard to approximate $\Delta$-TSP within
$\frac{5381}{5380} - \epsilon$ for any $\epsilon > 0$.

We will prove our lower bound even for the following special case of $\Delta$-TSP:
Let $\Delta_{\{1,2,3\}}$-TSP denote the special case of $\Delta$-TSP such that all edge costs are
from $\{1, 2, 3\}$.

To obtain the lower bound we use a reduction from the LinEq2-2(3) problem 
that is defined as follows: 

Definition 1. LinEq2-2(3) is the following maximization problem. Given a sys- 
tem of linear equations mod 2 with exactly two variables in each equation and 
exactly three occurrences of each variable, maximize the number of satisfied equa- 
tions. 
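As a small illustration of Definition 1, the sketch below represents an equation $x_i + x_j = r \pmod 2$ as a triple `(i, j, r)`, checks the occurrence condition, and finds the optimum by brute force. The representation and the function names are assumptions made for this example; exhaustive search is only feasible for tiny instances.

```python
from collections import Counter
from itertools import product


def is_lineq2_2_3(equations, num_vars):
    """True if every equation has two distinct variables and every variable occurs exactly 3 times."""
    occurrences = Counter()
    for i, j, rhs in equations:
        if i == j or rhs not in (0, 1):
            return False
        occurrences[i] += 1
        occurrences[j] += 1
    return all(occurrences[v] == 3 for v in range(num_vars))


def satisfied(equations, assignment):
    """Number of equations x_i + x_j = rhs (mod 2) satisfied by a 0/1 assignment."""
    return sum((assignment[i] + assignment[j]) % 2 == rhs for i, j, rhs in equations)


def max_satisfied(equations, num_vars):
    """Optimum of the maximization problem, by trying all 2^num_vars assignments."""
    return max(satisfied(equations, a) for a in product((0, 1), repeat=num_vars))


# Smallest possible instance: 2 variables, 3 equations.
equations = [(0, 1, 0), (0, 1, 1), (0, 1, 0)]
print(is_lineq2_2_3(equations, 2), max_satisfied(equations, 2))
```

In this toy instance the two equations with right-hand side 0 contradict the one with right-hand side 1, so at most two of the three equations can hold at once.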

The idea of our proof is an extension of the proof idea in [En99] where a
reduction from LinEq2-2(3) to $\Delta_{\{1,2\}}$-TSP (i.e. TSP with edge costs from $\{1, 2\}$)
was used.

To this end, we introduce the notion of gap problems, which means cutting
out of the original problems a subset of input instances such that there is a
certain gap in the allowed quality of the optimal solutions. Then the existence
of an NP-hard gap problem with gap $(\alpha, \beta)$ implies that no polynomial time
$\frac{\alpha}{\beta}$-approximation algorithm exists for the underlying optimization problem, unless
P = NP (see e.g. [Ho96], chapter 10, or [MPS98], chapter 8).

By $(\alpha, \beta)$-LinEq2-2(3), we describe the decision problem which has as input
only instances of LinEq2-2(3) where either at least a fraction $\alpha$ or at most a
fraction $\beta$ of the given equations can be satisfied at the same time, for some
$0 \le \beta < \alpha \le 1$.

Similarly, we define $(\alpha, \beta)$-$\Delta_{\{1,2,3\}}$-TSP. Here, only instances are admitted
where the number of vertices divided by the cost of a cheapest tour (between $\frac{1}{3}$
and 1 in general) is either above $\alpha$ or below $\beta$.

Please note that we have normalized the considered solution qualities to 
lie between 0 and 1, a larger number always signaling a better solution for 
maximization as well as for minimization problems. This normalization could 
be omitted in principle, but we find it more convenient for comparing different 
problems and instances of different size. 

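Below, we will give a reduction from $\left(\frac{332 - \epsilon_2'}{336}, \frac{331 + \epsilon_1'}{336}\right)$-LinEq2-2(3) to
$\left(\frac{7616}{7624 + \epsilon_2}, \frac{7616}{7626 - \epsilon_1}\right)$-$\Delta_{\{1,2,3\}}$-TSP (for any small $\epsilon_1, \epsilon_2 > 0$, and suitable $\epsilon_1', \epsilon_2'$).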

This reduction together with an inapproximability result by Berman and
Karpinski [BK98] (the NP-hardness of $\left(\frac{332 - \epsilon_2'}{336}, \frac{331 + \epsilon_1'}{336}\right)$-LinEq2-2(3) in our
terms) implies the following theorem.

Theorem 1. For any $\epsilon > 0$, approximating $\Delta_{\{1,2,3\}}$-TSP within $\frac{3813}{3812} - \epsilon$ is
NP-hard.

Corollary 1. For any $\epsilon > 0$, approximating $\Delta$-TSP within $\frac{3813}{3812} - \epsilon$ is NP-hard.

Due to space limitations we are unable to present the full proof of Theorem 1
in this extended abstract, but we will give a sketch of the proof here.

Sketch of the proof of Theorem 1. For a given LinEq2-2(3) instance, we first
construct an undirected graph $G_0$ which consists of $68n + 1$ vertices, if the given
LinEq2-2(3) instance has $3n$ equations and $2n$ variables. Then we construct a
$\Delta_{\{1,2,3\}}$-TSP instance $G$ from $G_0$ by setting the edge costs for all edges in $G_0$
to one and setting all other edge costs to the maximal possible value from $\{2, 3\}$
such that the triangle inequality is still satisfied. This means that the cost of an
edge $(u, v)$ is $\min\{3, \mathrm{dist}_0(u, v)\}$ where $\mathrm{dist}_0(u, v)$ is the distance of $u$ and $v$ in
$G_0$.
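The cost assignment just described can be sketched in a few lines of Python: breadth-first search gives the distances in $G_0$, and every cost is capped at 3. The adjacency-dictionary representation and the names `bfs_distances` and `capped_metric` are assumptions for this illustration; the actual graph $G_0$ of the reduction is not reproduced here.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` in an unweighted graph given as {vertex: neighbours}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def capped_metric(adj, cap=3):
    """Edge costs min{cap, dist(u, v)} for all ordered pairs of distinct vertices."""
    cost = {}
    for u in adj:
        dist = bfs_distances(adj, u)
        for v in adj:
            if u != v:
                # Unreachable vertices also receive the cap.
                cost[(u, v)] = min(cap, dist.get(v, cap))
    return cost


# Toy stand-in for G_0: a path on five vertices.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
cost = capped_metric(path)
print(cost[(0, 1)], cost[(0, 2)], cost[(0, 3)], cost[(0, 4)])
```

Capping a graph metric keeps the triangle inequality intact: if one of the two detour edges is capped, the detour already costs at least $3 + 1$, and otherwise the uncapped distances apply.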

Then we will show that an optimal Hamiltonian tour in $G$ uses $2e$ edges with
cost $\ne 1$, iff an optimal assignment in the LinEq2-2(3) instance satisfies all but
$e$ equations.

The main technical difficulty of the proof lies in additionally showing that
all these expensive edges must have a cost of 3, that is, they connect vertices at
distance at least 3 in $G_0$. More precisely, we show that any tour can be modified
without increasing the cost in such a way that the resulting tour has the desired
property.

After all, a LinEq2-2(3) instance with $336n$ equations is converted into a
$\Delta$-TSP instance with $7616n + 1$ vertices such that the following holds. If $(332 - \epsilon_2')n$
equations can be satisfied at the same time, there exists a Hamiltonian tour of
cost at most $(7624 + \epsilon_2)n + 1$, and if at most $(331 + \epsilon_1')n$ equations can be
satisfied at the same time, there exists no Hamiltonian tour of cost less than
$(7626 - \epsilon_1)n + 1$.

The general picture of the construction, that is, the structure of $G_0$, is shown
in Figure 1.




Fig. 1. The complete construction for a sample instance with 9 equations and 6 
variables. The first variable cluster is drawn with bold lines. 



For each equation, we construct an equation gadget which is one of the grey
boxes in the upper part of the picture. A variable cluster consists of some vertices
which are also part of equation gadgets together with some additional vertices.
The bold edges in Figure 1 mark a single variable cluster. Finally, all these parts
are connected in a large circle as shown in Figure 1.

There are two types of equation gadgets corresponding to the two types of
equations, being $x + y = 0$ and $x + y = 1$ respectively. For the first type, an
equation gadget of type 0 is shown in Figure 2(a), and Figure 2(b) depicts an
equation gadget of type 1.



Fig. 2. An equation gadget of type 0 is shown in (a), a gadget of type 1 in (b).



Please note, that vertices a and b are used in common by different gadgets, 
that is, vertex b of one gadget is vertex a of the next one. 

By definition, a LinEq2-2(3) instance has 2m variables and 3m equations, 
for some m. Counting the number of vertices used in the equation gadgets and 
variable clusters, we obtain a graph having 68m + 1 vertices. 

The LinEq2-2(3) instances of [BK98] use only numbers $m = 112n$, for some
natural number $n$, such that out of the resulting $336n$ equations either at least
$(332 - \epsilon_2')n$ or at most $(331 + \epsilon_1')n$ equations can be satisfied at the same time.

We will show that in our $\Delta$-TSP instances, $e$ non-satisfied equations translate
into cost $|V| + 2e$ for an optimal tour. Thus, starting from the LinEq2-2(3)
instances of [BK98], we obtain $\Delta$-TSP instances having $68 \cdot (112n) + 1 = 7616n + 1$
vertices. Hence, the cost of an optimal tour will be either at most
$7616n + 1 + 2(4 + \epsilon_2')n = (7624 + \epsilon_2)n + 1$ or at least
$7616n + 1 + 2(5 - \epsilon_1')n = (7626 - \epsilon_1)n + 1$, where $\epsilon_i = 2\epsilon_i'$.
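The arithmetic behind the constant of Theorem 1 can be spelled out as follows. This is only a rearrangement of the numbers quoted in the text; the labels "yes" and "no" for the two kinds of instances and the shorthand $\epsilon_i = 2\epsilon_i'$ are introduced here for readability.

```latex
\begin{align*}
|V| &= 68 \cdot 112n + 1 = 7616n + 1, \\
\mathrm{cost}_{\mathrm{yes}} &\le |V| + 2(4 + \epsilon_2')n = (7624 + \epsilon_2)n + 1, \\
\mathrm{cost}_{\mathrm{no}} &\ge |V| + 2(5 - \epsilon_1')n = (7626 - \epsilon_1)n + 1, \\
\frac{\mathrm{cost}_{\mathrm{no}}}{\mathrm{cost}_{\mathrm{yes}}}
  &\ge \frac{(7626 - \epsilon_1)n + 1}{(7624 + \epsilon_2)n + 1}
  \longrightarrow \frac{7626}{7624} = \frac{3813}{3812}
  \quad (n \to \infty,\ \epsilon_1, \epsilon_2 \to 0).
\end{align*}
```

An algorithm with approximation ratio below this gap could distinguish the two kinds of instances, which is where the bound $\frac{3813}{3812} - \epsilon$ comes from.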

The main technical part of the proof is to show the following claim. 

Claim. In $G$, the cost of an optimal Hamiltonian tour is $|V| + 2e$ iff in the
underlying LinEq2-2(3) instance at most all but $e$ equations can be satisfied at
the same time.

One direction of this claim is straightforward. Starting from an assignment
to the variables of the LinEq2-2(3) instance, we obtain a tour of the claimed
maximal cost as follows. It traverses the graph of Figure 1 essentially along the
outer cycle, taking some detours through variable clusters and equation gadgets.
If a variable is set to 1 in the given assignment, the tour uses the edges of the
variable cluster which visit all three gadgets of those equations where the variable
occurs. Otherwise it uses the shortcut below in Figure 1.

Thus, for a satisfied equation of type $x + y = 1$, in the corresponding gadget
exactly one of the vertex sequences $(c, d, e)$ resp. $(f, g, h)$ is visited as part of a
variable cluster. The rest of that gadget can be traversed as shown in Figure 3(a).
Similarly, in a gadget for an equation of type $x + y = 0$, both or none of the
mentioned sequences are traversed as part of a variable cluster. Thus, it admits
a traversal as depicted in Figure 3(b) and (c).

Fig. 3. The traversal of equation gadgets.

It remains to include the vertices of equation gadgets for unsatisfied equations
in the tour. Here, the traversal of the variable cluster implies that, for a gadget
of type 1, both or none of the sequences $(c, d, e)$ resp. $(f, g, h)$ are left open. And
for a gadget of type 0, exactly one of those sequences remains open. In this case,
we add pieces to the tour as depicted in Figure 3(d)-(f).

Only in the last step, we have used edges which are not part of $G_0$ (the
dashed edges in Figure 3). These connect vertices having distance at least 3 in
$G_0$, thus they have cost 3. All other edges are part of $G_0$, i.e., they have cost 1
in $G$.

Overall, an assignment leaving e equations unsatisfied results in a tour of 
cost \V\ + 2e. 

For the opposite direction, we have to show that an arbitrary Hamiltonian 
tour through G can be modified, without increasing the cost, into a tour which 
has the structure of a tour constructed from an assignment as above. Then, an 
assignment can be inferred from that tour having the claimed quality in a direct 
reversal of the above procedure. 

The mentioned transformation of an arbitrary tour consists of a lengthy 
procedure with many case distinctions which makes up the most part of the full 
proof. In the following, we give an overview of the transformations which are 
needed, and we demonstrate the method in one exemplary case. 

In the sequel, we use the notion of an endpoint in the considered tour. An
endpoint is a vertex with the property that at least one of its two incident edges
used in the tour does not belong to $G_0$. A connector of a gadget denotes the
pairs of edges leaving the gadget from vertices $c$ and $e$, or from vertices $f$ and
$h$ respectively. It is semi-traversed by a tour, if exactly one of its edges is used
by the tour, and it is traversed if both belong to the tour. Distances will always
refer to $G_0$.

Now, the task at hand is to modify a given tour, without increasing the cost, 
in a way such that 

1. the distance between endpoints will always be at least 3, 

2. there will be no semitraversed connectors, and 

3. in the equation gadgets there are always 0 or 2 endpoints. 

In the resulting tour, an equation gadget will have no endpoints iff the number 
of traversed connectors equals the gadget type (modulo 2). 

We will illustrate the method by sketching the proof of the following claim. 

Claim. Assume a tour having no semi-traversed connectors and a gadget of type
0 where the tour traverses exactly one connector. Then the tour can be modified
without increasing the cost, and without changing it on the connector edges,
in such a way that there are exactly two endpoints at distance $\ge 3$ in that
gadget, and it is impossible to modify it such that there are only two endpoints
at distance 2.

Roughly speaking, this states that an optimal tour has to look like Figure 3(f)
in a gadget of an unsatisfied equation of type 0.

To show this claim, first one can easily verify that the fixed use of connector 
edges implies the use of at least two endpoints. Then, the essential insight is 
that there is no way of managing it with two endpoints of distance 2. 

Assuming the contrary, we have to look for a pair of vertices at distance
2 which may be used as endpoints. Distance 2 implies that there is a vertex
adjacent to both of these in $G_0$. That vertex cannot be of degree 2 in $G_0$ since
then the tour could be simply improved to have no endpoints in the gadget at
all. It remains to check all pairs of vertices where the common neighbor is of
degree 3. That means to check whether we can get a Hamiltonian path from
vertex $a$ to $b$ in the gadget of type 0 without using the left connector (i.e. vertices
$c, d, e$) by adding exactly one of the edges shown in Figure 4. It turns out that
this is impossible. □

Fig. 4. The traversal of equation gadgets.

3 Fundamental Observations about the TSP with
Sharpened Triangle Inequality

First, we observe that also the $\Delta_\beta$-TSP with $\frac{1}{2} < \beta < 1$ is a hard optimization
problem.

Lemma 1. For any $\epsilon > 0$ and for any $\frac{1}{2} < \beta < 1$ it is NP-hard to approximate
the $\Delta_\beta$-TSP within $\frac{7611 + 10\beta^2 + 5\beta}{7612 + 8\beta^2 + 4\beta} - \epsilon$.

Proof. The claim can be shown analogously to the proof of Theorem 1 using
the edge costs $\{1, 2\beta, 2\beta^2 + \beta\}$ instead of $\{1, 2, 3\}$. □

In the sequel, we will use the following notations. Let $G = (V, E)$ be a complete
graph with a cost function $\mathrm{cost} : E \to \mathbb{R}^{>0}$. Then $c_{\min} = \min_{e \in E} \mathrm{cost}(e)$
and $c_{\max} = \max_{e \in E} \mathrm{cost}(e)$. Furthermore, $H_{opt}$ will denote an optimal Hamiltonian
tour in $G$.



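Lemma 2. Let $\frac{1}{2} \le \beta < 1$, let $(G, \mathrm{cost})$ be a problem instance of $\Delta_\beta$-TSP.

(a) For any two edges $e_1, e_2$ with a common endpoint, $\mathrm{cost}(e_1) \le \frac{\beta}{1 - \beta} \cdot \mathrm{cost}(e_2)$.

(b) $\frac{c_{\max}}{c_{\min}} \le \frac{2\beta^2}{1 - \beta}$.

Proof. (a): For any triangle consisting of the edges $e_1, e_2, e_3$ the $\beta$-triangle
inequality implies $\mathrm{cost}(e_1) \le \beta(\mathrm{cost}(e_2) + \mathrm{cost}(e_3))$ and $\mathrm{cost}(e_3) \le \beta(\mathrm{cost}(e_1) + \mathrm{cost}(e_2))$
and thus $\mathrm{cost}(e_1) \le \beta(\mathrm{cost}(e_2) + \beta(\mathrm{cost}(e_1) + \mathrm{cost}(e_2)))$. This implies
$\mathrm{cost}(e_1) \le \frac{\beta + \beta^2}{1 - \beta^2} \cdot \mathrm{cost}(e_2) = \frac{\beta}{1 - \beta} \cdot \mathrm{cost}(e_2)$.

(b): Let $\{a, b\}$ be an edge with cost $c_{\min}$ and let $\{c, d\}$ be an edge with cost
$c_{\max}$. If these edges have a common endpoint, the claim follows immediately from
(a) since $\frac{\beta}{1 - \beta} \le \frac{2\beta^2}{1 - \beta}$. If these edges do not have a common endpoint, we know
from (a) that $\mathrm{cost}(\{a, c\}), \mathrm{cost}(\{a, d\}) \le \frac{\beta}{1 - \beta} \cdot c_{\min}$. With the $\beta$-triangle inequality
we have $c_{\max} \le \beta(\mathrm{cost}(\{a, c\}) + \mathrm{cost}(\{a, d\})) \le \beta \cdot 2 \cdot \frac{\beta}{1 - \beta} \cdot c_{\min} = \frac{2\beta^2}{1 - \beta} \cdot c_{\min}$. □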

Part (b) of Lemma 2 directly implies that, for every Hamiltonian tour $H$ of
an input instance for $\Delta_\beta$-TSP,

$$\frac{\mathrm{cost}(H)}{\mathrm{cost}(H_{opt})} \le \frac{2\beta^2}{1 - \beta}.$$

The following three approaches essentially improve over this trivial approximation
ratio.

4 Using $\Delta$-TSP Approximation Algorithms for $\Delta_\beta$-TSP

In the first part of this section, we analyze the approximation ratio of the
Christofides algorithm for $\Delta_\beta$-TSP. The main idea is that we can improve the
$\frac{3}{2}$ approximation ratio due to shortening of paths³ in two different parts of the
Christofides algorithm, namely by building the matching and by constructing the
Hamiltonian tour from the Eulerian tour. In fact we statistically prove that we
save some non-negligible part of the costs of a “lot” of edges of the constructed
Eulerian graph.



Theorem 2. For $\frac{1}{2} \le \beta < 1$ the Christofides algorithm is a
$\left(1 + \frac{2\beta - 1}{3\beta^2 - 2\beta + 1}\right)$-approximation algorithm for $\Delta_\beta$-TSP.

We will omit the proof of Theorem 2 in this extended abstract.

Observe that $\delta_\beta = \frac{2\beta - 1}{3\beta^2 - 2\beta + 1}$ is equal to 0 if $\beta = \frac{1}{2}$ (i.e. if all edges have
the same cost), and that $\delta_\beta = \frac{1}{2}$ for $\beta = 1$. Thus, $\delta_\beta$ continuously converges to
0 with the degree of the sharpening of the triangle inequality.

Now, we show how to modify any $\alpha$-approximation algorithm for $\Delta$-TSP to
obtain a $\delta_{\alpha,\beta}$-approximation algorithm for $\Delta_\beta$-TSP with $\delta_{\alpha,\beta} < \alpha$ for every $\beta < 1$.⁴
The advantage of this approach is that any improvement on the approximation
of $\Delta$-TSP automatically results in an improvement of the approximation ratio
for $\Delta_\beta$-TSP. The idea of this approach is to reduce an input instance of $\Delta_\beta$-TSP
to an input instance of $\Delta$-TSP by subtracting a suitable cost from all edges.



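³ Note that exchanging a path for an edge saves a positive amount of costs because of
the $\beta$-triangle inequality.

⁴ Observe that the approximation ratio tends to 1 with $\beta$ approaching $\frac{1}{2}$ and it tends
to $\alpha$ with $\beta$ approaching 1.

Theorem 3. Let $A$ be an approximation algorithm for $\Delta$-TSP with approximation
ratio $\alpha$, and let $\frac{1}{2} < \beta < 1$. Then $A$ is an approximation algorithm for
$\Delta_\beta$-TSP with approximation ratio $\frac{\alpha \cdot \beta^2}{\beta^2 + (\alpha - 1) \cdot (1 - \beta)^2}$.

Proof. Let $I = (G, \mathrm{cost})$ be a problem instance of $\Delta_\beta$-TSP, $\frac{1}{2} < \beta < 1$. Let
$c = (1 - \beta) \cdot 2 \cdot c_{\min}$. For all $e \in E(G)$, let $\mathrm{cost}'(e) = \mathrm{cost}(e) - c$. Then the TSP
instance $I' = (G, \mathrm{cost}')$ still satisfies the triangle inequality: Let $x, y, z$ be the
costs of the edges of an arbitrary triangle of $G$. Then $z \le \beta \cdot (x + y)$ holds. Since
$c = (1 - \beta) \cdot 2 \cdot c_{\min} \le (1 - \beta) \cdot (x + y)$ it follows that $z \le \beta \cdot (x + y) \le x + y - c$
and thus $z - c \le x - c + y - c$.

Furthermore we know that a Hamiltonian tour is optimal for $I'$ if and only
if it is optimal for $I$. Let $H_{opt}$ be an optimal Hamiltonian tour for $I$. Let $H$ be
the Hamiltonian tour that is produced by the algorithm $A$ on the input $I'$. Then
$\mathrm{cost}'(H) \le \alpha \cdot \mathrm{cost}'(H_{opt})$ holds and thus

$$\mathrm{cost}(H) - n \cdot c \le \alpha \cdot (\mathrm{cost}(H_{opt}) - n \cdot c).$$

This leads to

$$\begin{aligned}
\mathrm{cost}(H) &\le \alpha \cdot \mathrm{cost}(H_{opt}) - (\alpha - 1) \cdot n \cdot c \\
&= \alpha \cdot \mathrm{cost}(H_{opt}) - (\alpha - 1) \cdot n \cdot (1 - \beta) \cdot 2 \cdot c_{\min} \\
&\le \alpha \cdot \mathrm{cost}(H_{opt}) - (\alpha - 1) \cdot n \cdot (1 - \beta) \cdot 2 \cdot \frac{1 - \beta}{2\beta^2} \cdot c_{\max} \\
&= \alpha \cdot \mathrm{cost}(H_{opt}) - (\alpha - 1) \cdot \frac{(1 - \beta)^2}{\beta^2} \cdot n \cdot c_{\max}.
\end{aligned}$$

Let $\Gamma = \{\gamma \ge 1 \mid \mathrm{cost}(H) \le \gamma \cdot \mathrm{cost}(H_{opt}) \le n \cdot c_{\max}\}$. Then, for any $\gamma \in \Gamma$,

$$\begin{aligned}
\mathrm{cost}(H) &\le \alpha \cdot \mathrm{cost}(H_{opt}) - (\alpha - 1) \cdot \frac{(1 - \beta)^2}{\beta^2} \cdot \gamma \cdot \mathrm{cost}(H_{opt}) \\
&= \left(\alpha - \gamma \cdot (\alpha - 1) \cdot \frac{(1 - \beta)^2}{\beta^2}\right) \cdot \mathrm{cost}(H_{opt}). \qquad (1)
\end{aligned}$$

As a consequence

$$\mathrm{cost}(H) \le \min_{\gamma \in \Gamma} \min\left\{\gamma,\ \alpha - \gamma \cdot (\alpha - 1) \cdot \frac{(1 - \beta)^2}{\beta^2}\right\} \cdot \mathrm{cost}(H_{opt}). \qquad (2)$$

This minimum is achieved for $\gamma = \alpha - \gamma \cdot (\alpha - 1) \cdot \frac{(1 - \beta)^2}{\beta^2}$. This leads to

$$\gamma = \frac{\alpha \cdot \beta^2}{\beta^2 + (\alpha - 1) \cdot (1 - \beta)^2} \qquad (3)$$

which completes the proof.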
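The cost-shifting step of the proof can be tried out on a toy instance. The sketch below subtracts $c = (1 - \beta) \cdot 2 \cdot c_{\min}$ from every edge, confirms that the shifted instance is still metric, and evaluates the ratio of Theorem 3. The 4-vertex cost matrix is hypothetical and the helper names `shift_costs`, `is_metric` and `theorem3_ratio` are assumptions for this example; an actual run would pass the shifted instance to some $\alpha$-approximation algorithm for $\Delta$-TSP, which is not implemented here.

```python
from itertools import permutations


def shift_costs(cost, beta):
    """Subtract c = (1 - beta) * 2 * c_min from every edge, as in the proof of Theorem 3."""
    n = len(cost)
    c_min = min(cost[u][v] for u in range(n) for v in range(n) if u != v)
    c = (1 - beta) * 2 * c_min
    shifted = [
        [cost[u][v] - c if u != v else 0.0 for v in range(n)] for u in range(n)
    ]
    return shifted, c


def is_metric(cost, tol=1e-9):
    """Ordinary triangle inequality for all triples of distinct vertices."""
    n = len(cost)
    return all(
        cost[u][v] <= cost[u][x] + cost[x][v] + tol
        for u, v, x in permutations(range(n), 3)
    )


def theorem3_ratio(alpha, beta):
    """Approximation ratio alpha*beta^2 / (beta^2 + (alpha-1)*(1-beta)^2)."""
    return alpha * beta**2 / (beta**2 + (alpha - 1) * (1 - beta) ** 2)


# Hypothetical instance; it satisfies the beta-triangle inequality for beta = 0.7.
cost = [
    [0.0, 1.0, 1.2, 1.5],
    [1.0, 0.0, 1.1, 1.3],
    [1.2, 1.1, 0.0, 1.0],
    [1.5, 1.3, 1.0, 0.0],
]
shifted, c = shift_costs(cost, 0.7)
print(c, is_metric(shifted), theorem3_ratio(1.5, 0.7))
```

Since every Hamiltonian tour has exactly $n$ edges, the shift lowers the cost of all tours by the same amount $n \cdot c$, which is why optimal tours are preserved.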



Corollary 2. For $\frac{1}{2} \le \beta < 1$ the Christofides algorithm is a
$\left(1 + \frac{2\beta - 1}{3\beta^2 - 2\beta + 1}\right)$-approximation algorithm for $\Delta_\beta$-TSP.

Note that Corollary 2 provides the same approximation ratio as Theorem 2.
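The agreement of the two ratios can be checked directly by substituting the Christofides ratio $\alpha = \frac{3}{2}$ into the bound of Theorem 3. The short derivation below is a worked verification added for the reader, not part of the original proof.

```latex
\begin{align*}
\frac{\alpha \beta^2}{\beta^2 + (\alpha - 1)(1 - \beta)^2}\bigg|_{\alpha = 3/2}
  &= \frac{\tfrac{3}{2}\beta^2}{\beta^2 + \tfrac{1}{2}(1 - \beta)^2}
   = \frac{3\beta^2}{2\beta^2 + (1 - 2\beta + \beta^2)}
   = \frac{3\beta^2}{3\beta^2 - 2\beta + 1} \\
  &= \frac{(3\beta^2 - 2\beta + 1) + (2\beta - 1)}{3\beta^2 - 2\beta + 1}
   = 1 + \frac{2\beta - 1}{3\beta^2 - 2\beta + 1}.
\end{align*}
```

At $\beta = 1$ this gives $\frac{3}{2}$ and at $\beta = \frac{1}{2}$ it gives 1, matching the two boundary values stated for $\delta_\beta$.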



5 The Cycle Cover Algorithm 

In this section we design a new, special algorithm for $\Delta_\beta$-TSP with $\beta < 1$. This
algorithm provides a better approximation ratio than the previous approaches
for $\frac{1}{2} \le \beta < \frac{2}{3}$.
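The crossover at $\beta = \frac{2}{3}$ can be checked with exact rational arithmetic. The sketch below uses the Christofides ratio in the form $1 + \frac{2\beta - 1}{3\beta^2 - 2\beta + 1}$ (Theorem 3 with $\alpha = \frac{3}{2}$) and the cycle cover ratio $\frac{2}{3} + \frac{1}{3} \cdot \frac{\beta}{1 - \beta}$ of Theorem 4; the function names are assumptions for this example.

```python
from fractions import Fraction


def christofides_ratio(beta):
    """Ratio of the Christofides algorithm on instances with parameter beta."""
    return 1 + (2 * beta - 1) / (3 * beta**2 - 2 * beta + 1)


def cycle_cover_ratio(beta):
    """Ratio of the cycle cover algorithm (requires beta < 1)."""
    return Fraction(2, 3) + Fraction(1, 3) * beta / (1 - beta)


for beta in (Fraction(1, 2), Fraction(3, 5), Fraction(2, 3), Fraction(3, 4)):
    better = "cycle cover" if cycle_cover_ratio(beta) < christofides_ratio(beta) else "Christofides (or tie)"
    print(beta, cycle_cover_ratio(beta), christofides_ratio(beta), better)
```

Both ratios equal 1 at $\beta = \frac{1}{2}$ and $\frac{4}{3}$ at $\beta = \frac{2}{3}$; in between the cycle cover ratio is the smaller one, and beyond $\frac{2}{3}$ it grows without bound as $\beta$ approaches 1.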

Cycle Cover Algorithm

Input: A complete graph $G = (V, E)$ with a cost function $\mathrm{cost} : E \to \mathbb{R}^{>0}$
satisfying the $\beta$-triangle inequality.

Step 1: Construct a minimum cost cycle cover $C = \{C_1, \ldots, C_k\}$ of $G$, i.e. a
covering of all nodes in $G$ by cycles of length $\ge 3$.

Step 2: For $1 \le i \le k$, find the cheapest edge $\{a_i, b_i\}$ in every cycle $C_i$ of $C$.

Step 3: Obtain a Hamiltonian cycle $H$ of $G$ from $C$ by replacing the edges
$\{\{a_i, b_i\} \mid 1 \le i \le k\}$ by the edges $\{\{b_i, a_{i+1}\} \mid 1 \le i \le k - 1\} \cup \{\{b_k, a_1\}\}$.

Output: $H$.
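Steps 2 and 3 can be sketched as follows. The code assumes that a cycle cover is already given as a list of vertex lists in cyclic order, so Step 1 (computing a minimum cost cycle cover) is deliberately left out; the names `patch_cycle_cover` and `tour_cost` and the toy instance are assumptions for this illustration.

```python
def tour_cost(tour, cost):
    """Cost of the closed tour visiting the vertices in the given order."""
    return sum(cost[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def patch_cycle_cover(cycles, cost):
    """Steps 2 and 3: drop the cheapest edge of every cycle and chain the paths.

    Each cycle is a list of vertices in cyclic order. Removing the cheapest
    edge {a_i, b_i} leaves a path from a_i to b_i; concatenating these paths
    adds the edges {b_i, a_(i+1)}, and closing the tour adds {b_k, a_1}.
    """
    tour = []
    for cycle in cycles:
        m = len(cycle)
        # Step 2: index j such that {cycle[j], cycle[j+1]} is the cheapest edge.
        j = min(range(m), key=lambda i: cost[cycle[i]][cycle[(i + 1) % m]])
        # Path from a_i = cycle[j+1] around the cycle to b_i = cycle[j].
        tour.extend(cycle[j + 1:] + cycle[:j + 1])
    return tour


# Toy instance: all edges cost 1.2 except two cheap edges of cost 1.0.
cost = [[0.0 if u == v else 1.2 for v in range(6)] for u in range(6)]
for u, v in ((0, 1), (3, 4)):
    cost[u][v] = cost[v][u] = 1.0
cycles = [[0, 1, 2], [3, 4, 5]]  # a cycle cover of total cost 6.8
tour = patch_cycle_cover(cycles, cost)
print(tour, tour_cost(tour, cost))
```

The patched tour visits every vertex exactly once because each cycle contributes all of its vertices as one contiguous path.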



Theorem 4. For $\frac{1}{2} \le \beta < 1$ the cycle cover algorithm is a
$\left(\frac{2}{3} + \frac{1}{3} \cdot \frac{\beta}{1 - \beta}\right)$-approximation algorithm for $\Delta_\beta$-TSP.

Proof. Obviously the cost of the minimal cycle cover is a lower bound on the cost
of an optimal Hamiltonian cycle. Such a cycle cover can be found in polynomial
time.

For every cycle $C_i$, $1 \le i \le k$, the cheapest edge is removed from $C_i$. Thus,
the cost of the remaining edges is at least $\frac{2}{3} \cdot \mathrm{cost}(C_i)$ since every cycle of the
cover has a length of at least 3. The removed edges are replaced by adjacent
edges. According to Lemma 2 the costs of these new edges can exceed the costs
of the removed edges in $C$ by a factor of at most $\frac{\beta}{1 - \beta}$. Thus, we have

$$\mathrm{cost}(H) \le \left(\frac{2}{3} + \frac{1}{3} \cdot \frac{\beta}{1 - \beta}\right) \cdot \mathrm{cost}(C) \le \left(\frac{2}{3} + \frac{1}{3} \cdot \frac{\beta}{1 - \beta}\right) \cdot \mathrm{cost}(H_{opt})$$

which completes the proof. □

Abstract. A $\lambda$-coloring of a graph $G$ is an assignment of colors from the
set $\{0, \ldots, \lambda\}$ to the vertices of a graph $G$ such that vertices at distance
at most two get different colors and adjacent vertices get colors which
are at least two apart. The problem of finding $\lambda$-colorings with small or
optimal $\lambda$ arises in the context of radio frequency assignment. We show
that the problems of finding the minimum $\lambda$ for planar graphs, bipartite
graphs, chordal graphs and split graphs are NP-Complete. We then
give approximation algorithms for $\lambda$-coloring and compute upper bounds
of the best possible $\lambda$ for outerplanar graphs, planar graphs, graphs of
treewidth $k$, permutation and split graphs. With the exception of the
split graphs, all the above bounds for $\lambda$ are linear in $\Delta$, the maximum
degree of the graph. For split graphs, we give a bound of $\lambda \le \Delta^{1.5} + 2\Delta + 2$
and show that there are split graphs with $\lambda = \Omega(\Delta^{1.5})$. Similar results
are also given for variations of the $\lambda$-coloring problem.



1 Introduction 

Radio frequency assignment is a widely studied area of research. The task is to 
assign radio frequencies to transmitters at different locations without causing 
interference. The problem is closely related to graph coloring where the vertices 
of a graph represent the transmitters and adjacencies indicate possible interfer- 
ences. 

Griggs and Yeh introduced a problem proposed by Roberts which
they call the $L(2,1)$-labeling problem. It is the problem of assigning radio
frequencies (integers) to transmitters such that transmitters that are close (distance
2 apart) to each other receive different frequencies and transmitters that are very
close together (distance 1 apart) receive frequencies that are at least two apart.
To keep the frequency bandwidth small, they are interested in computing the
difference between the highest and lowest frequencies that have been assigned to
the radio network. They call the minimum difference of the range of frequencies
$\lambda$. The problem is then equivalent to assigning an integer from $\{0, \ldots, \lambda\}$ to the
nodes of the networks satisfying the $L(2,1)$-labeling constraint.



H. Reichel and S. Tison (Eds.): STACS 2000, LNCS 1770, pp. 395 ff., 2000.
(c) Springer-Verlag Berlin Heidelberg 2000

Subsequently, different bounds of $\lambda$ were obtained for various graphs. A common
parameter used is $\Delta$, the maximum degree of a graph. The obvious lower
bound for $\lambda$ is $\Delta + 1$, achieved for the tree $K_{1,\Delta}$. It was shown that for
every graph $G$, $\lambda \le \Delta^2 + 2\Delta$. This upper bound was later improved to
$\lambda \le \Delta^2 + \Delta$.



For some special classes of graphs, tight bounds are known and can be computed
efficiently. These include paths, cycles, wheels and complete $k$-partite
graphs, trees, cographs, $k$-almost trees, cacti, unicycles and
bicycles, and grids, hexagons and hexagon-meshes. Other types of graphs
have also been studied, but only approximate bounds are known for them. These
are chordal graphs and unit interval graphs, interval graphs, hypercubes,
bipartite graphs, and outerplanar and planar graphs.






In this paper, we extend the upper bounds of $\lambda$ to other graphs and also
improve some existing bounds for some classes of graphs. Precisely, new bounds
are provided for graphs of treewidth $k$, permutation graphs and split graphs.
We also improve the previously known bounds for planar graphs and outerplanar graphs.
Efficient algorithms for labeling the graphs achieving these bounds are also given.
With the exception of split graphs, all the above bounds are linear in $\Delta$. For
split graphs, we give a bound of $\lambda \le \Delta^{1.5} + 2\Delta + 2$ and show that there are split
graphs with $\lambda = \Omega(\Delta^{1.5})$. This is the first bound for $\lambda$ that we know of that is
neither linear in $\Delta$ nor in $\Delta^2$.



It has been shown that determining $\lambda$ of a graph is an NP-Complete
problem, even for graphs with diameter two. It was further shown
that it is also NP-Complete to determine if $\lambda \le k$ for every fixed integer $k \ge 4$
(the case when $\lambda \le 3$ occurs only when $G$ is a disjoint union of paths of length
at most 3). In this paper, we show that the problem remains NP-Complete when
restricted to planar graphs, bipartite graphs, chordal graphs and split graphs.

The $L(2,1)$-labeling problem proposed by Roberts is basically a problem
of avoiding adjacent-band interferences: adjacent bands must have frequencies
sufficiently far apart. There are several variations of the $\lambda$-coloring problems
in the context of frequency assignment in multihop radio networks. Two other
common types of collisions (frequency interference) that have been studied are
direct and hidden collisions. In direct collisions, a radio station and its neighbors
must have different frequencies, so their signals will not collide (overlap). This
is just the normal vertex-coloring problem with its associated chromatic number
$\chi(G)$. In hidden collisions, a radio station must not receive signals of the same
frequency from any of its adjacent neighbors. Thus, the only requirement here
is that for each station, all its neighbors must have distinct frequencies (colors),
but there is no requirement on the color of the station itself.

In [3,22], the special case of avoiding hidden collisions in multihop radio
networks was studied. We call this the $L(0,1)$-labeling problem (this notation
was not used in [3,22]). In [?], the problem is to avoid both direct and hidden
collisions in the radio network. Thus, a station and all of its neighbors must all
have distinct colors. This is called $L(1,1)$-labeling in [25]. It is also known as the
distance-2 coloring problem and is equivalent to the normal coloring of the square
of a graph, $G^2$, and has also been well-studied. These variations of $\lambda$-coloring
are NP-complete even for planar graphs [?].

Perhaps a more applicable term for all these $\lambda$-variations is the one given
by Harary [?]: Radio-Coloring. We apply our algorithms to these variations as
well and obtain similar bounds.

The paper is organized as follows. We give some definitions of graphs and
generalizations of the $\lambda$-coloring problem in the next section. Then different
upper bounds and algorithms for outerplanar graphs, planar graphs, graphs of
treewidth $k$, permutation graphs and split graphs are presented in Section 3.
The complexity results for planar graphs, bipartite graphs, chordal graphs and
split graphs are given in Section 4. Finally, in the last section we mention some
open problems.

2 Preliminaries 

2.1 $\lambda$-Coloring

Let $G = (V, E)$ be a graph with vertex set $V$ and edge set $E$. The number of
vertices in $G$ is denoted by $n$ and the maximum degree of $G$ by $\Delta$.

Definition 1. Let $G$ be a graph and $d_1, d_2$ be two non-negative integers. A
$\lambda$-coloring is an assignment of colors from a set $\{0, \ldots, \lambda\}$ to the vertices of the
graph. The $\lambda$-coloring satisfies the $L(d_1, d_2)$-constraint if each pair of vertices
at distance $i$, $1 \le i \le 2$, in the graph gets colors that differ by at least $d_i$.
If a $\lambda$-coloring of $G$ satisfies the $L(d_1, d_2)$-constraint, then we say that $G$ has
an $L(d_1, d_2)$-labeling. The minimum value $\lambda$ for which $G$ admits a $\lambda$-coloring
satisfying the $L(d_1, d_2)$-constraint is denoted by $\lambda_{d_1,d_2}(G)$, or, when $G$ is clear
from the context, by $\lambda_{d_1,d_2}$.
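
As an illustration of Definition 1, the following minimal sketch checks whether a given assignment of colors satisfies the $L(d_1, d_2)$-constraint. The adjacency-list representation and the names `distances_from`, `satisfies_constraint`, `path` and `labeling` are assumptions made for this example only; they do not come from the paper.

```python
from collections import deque


def distances_from(adj, source):
    """Breadth-first search distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def satisfies_constraint(adj, color, d1, d2):
    """Return True iff `color` satisfies the L(d1, d2)-constraint on `adj`."""
    required = {1: d1, 2: d2}  # minimum color gap at distance 1 and 2
    for u in adj:
        for v, d in distances_from(adj, u).items():
            if d in required and abs(color[u] - color[v]) < required[d]:
                return False
    return True


# Hypothetical example: the path a - b - c - d with colors from {0, ..., 3}.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
labeling = {"a": 1, "b": 3, "c": 0, "d": 2}
print(satisfies_constraint(path, labeling, 2, 1))
```

The check runs one breadth-first search per vertex, which is enough for small examples; it is not meant as an efficient verifier.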

In this paper, we shall focus mainly on particular $L(d_1, d_2)$-labelings which
have been studied in the literature: $L(2,1)$-labeling ([?]), $L(1,1)$-labeling ([?]),
and $L(0,1)$-labeling ([3,22]).

Fact 1. For any graph $G$, the following lower bounds hold:

1. $\lambda_{0,1} \ge \Delta - 1$ [?],

2. $\lambda_{1,1} \ge \Delta$ [?],

3. $\lambda_{2,1} \ge \Delta + 1$ [16].

All these bounds are easily obtained by considering the tree $K_{1,\Delta}$, which is
contained in any graph of maximum degree $\Delta$.
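
The three values in Fact 1 can be checked on small stars by exhaustive search. The sketch below is only an illustration: the function names are hypothetical, and the search is exponential in $\Delta$, so it is meant for very small cases.

```python
from itertools import combinations, product


def star_labeling_ok(colors, d1, d2):
    """Check the L(d1, d2)-constraint on a star K_{1,Delta}.

    colors[0] is the center; colors[1:] are the leaves. The center is at
    distance 1 from every leaf, and any two leaves are at distance 2.
    """
    center, leaves = colors[0], colors[1:]
    if any(abs(center - c) < d1 for c in leaves):
        return False
    return all(abs(a - b) >= d2 for a, b in combinations(leaves, 2))


def min_lambda_star(delta, d1, d2):
    """Smallest lambda such that K_{1,delta} has an L(d1, d2)-labeling."""
    lam = 0
    while True:
        for colors in product(range(lam + 1), repeat=delta + 1):
            if star_labeling_ok(colors, d1, d2):
                return lam
        lam += 1


for d1, d2 in [(0, 1), (1, 1), (2, 1)]:
    print((d1, d2), min_lambda_star(3, d1, d2))
```

For $\Delta = 3$ the search returns $2$, $3$ and $4$, matching $\Delta - 1$, $\Delta$ and $\Delta + 1$.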

2.2 Special Graphs 

Definition 2. A $k$-tree is a graph of $n \ge k + 1$ vertices defined recursively as
follows. A clique of $k + 1$ vertices is a $k$-tree. A $k$-tree with $n + 1$ vertices can be
formed from a $k$-tree with $n$ vertices by making a new vertex adjacent to exactly
all vertices of a $k$-clique in the $k$-tree with $n$ vertices.

Definition 3. A graph is a partial $k$-tree if it is a subgraph of a $k$-tree.

Definition 4. The treewidth of a graph is the minimum value k for which the 
graph is a subgraph of a k-tree. 

A useful way of dealing with the treewidth of a graph is via its tree-decomposition. 

Definition 5. A tree decomposition of a graph $G = (V, E)$ is a pair
$(\{X_i \mid i \in I\}, T = (I, F))$ with $\{X_i \mid i \in I\}$ a collection of subsets of $V$,
and $T = (I, F)$ a tree, such that

— $\bigcup_{i \in I} X_i = V$,

— for all edges $(v, w) \in E$ there is an $i \in I$ with $v, w \in X_i$,

— for all $i, j, k \in I$: if $j$ is on the path from $i$ to $k$ in $T$, then $X_i \cap X_k \subseteq X_j$.

The width of a tree decomposition $(\{X_i \mid i \in I\}, T = (I, F))$ is
$\max_{i \in I} |X_i| - 1$. The treewidth of a graph $G = (V, E)$ is the minimum width
over all tree decompositions of $G$.
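
The three conditions of Definition 5 can be tested mechanically. The sketch below is an illustration under stated assumptions: bags are given as a dictionary from tree nodes to vertex sets, `tree_edges` is assumed to already describe a tree, and the third condition is checked in its standard equivalent form (for every vertex, the tree nodes whose bags contain it are connected in $T$). All names are hypothetical.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check the three conditions of a tree decomposition."""
    # Condition 1: the bags cover V.
    if set().union(*bags.values()) != set(vertices):
        return False
    # Condition 2: every edge lies inside some bag.
    if not all(any({v, w} <= bag for bag in bags.values()) for v, w in edges):
        return False
    # Condition 3: for each vertex, the nodes containing it are connected in T.
    tree_adj = {i: set() for i in bags}
    for i, j in tree_edges:
        tree_adj[i].add(j)
        tree_adj[j].add(i)
    for v in vertices:
        nodes = {i for i, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in tree_adj[i]:
                if j in nodes and j not in seen:
                    seen.add(j)
                    stack.append(j)
        if seen != nodes:
            return False
    return True


def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1


# Hypothetical example: the path a - b - c - d.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("c", "d")]
good = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}
bad = {1: {"a", "b"}, 2: {"c", "d"}, 3: {"b", "c"}}
T = [(1, 2), (2, 3)]
print(is_tree_decomposition(V, E, good, T), width(good))
print(is_tree_decomposition(V, E, bad, T))
```

In the second decomposition the vertex `b` occurs in bags 1 and 3 but not in bag 2, which lies between them in the tree, so the third condition fails.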

It can be shown that the above definitions of treewidth are equivalent and
that every graph with treewidth $\le k$ is a partial $k$-tree and conversely, that
every partial $k$-tree has treewidth $\le k$. For more details on treewidth, $k$-trees
and other equivalent definitions, consult, for example, [21].

We now define a few more special graphs. [?] and [?] are good references for
other definitions and results concerning these special graphs.

Definition 6. A graph is chordal or triangulated iff every cycle of length $\ge 4$
has a chord (i.e., there is no induced cycle of length $\ge 4$).

Definition 7. A vertex of a graph G is simplicial if its neighbors induce a clique. 

Fact 2. [?] Let $G$ be a chordal graph and $|V| = n$. There is an elimination
scheme (an ordering of the vertices) for $G$, $[v_1, \ldots, v_n]$, such that for each $i$, $1 \le i \le n$,
the vertex $v_i$ is a simplicial vertex in the subgraph induced by $[v_{i+1}, \ldots, v_n]$.
Such an elimination scheme is called a perfect elimination scheme.

Definition 8. A split graph is a graph G of which the vertex set can be split into 
two sets K and S, such that K induces a clique and S induces an independent 
set in G. 

A permutation graph can be obtained from a permutation $\pi = [\pi_1 \ldots \pi_n]$
of integers from 1 to $n$ in the following visual manner. Line up the numbers
1 to $n$ horizontally on a line. On the line below it, line up the corresponding
permutation so that $\pi_i$ is right below $i$. Now connect each $i$ and $\pi_j$ such that
$\pi_j = i$ with an edge. Such edges are called matching edges and the resulting
diagram is referred to as a matching diagram. The inversion graph is the graph
$G_\pi = (V, E)$ with $V = \{1, \ldots, n\}$ and $(i, j) \in E$ iff the matching lines of $i$ and $j$
in the matching diagram intersect. Formally, one can define a permutation graph
as follows.

Definition 9. Let $\pi = [\pi_1 \ldots \pi_n]$ be a permutation of integers from 1 to $n$.
Then the permutation graph determined by $\pi$ is the graph $G_\pi = (V, E)$ with
$V = \{1, \ldots, n\}$ and $(i, j) \in E$ iff $(i - j)(\pi^{-1}_i - \pi^{-1}_j) < 0$, where $\pi^{-1}_i$ is the
inverse of $\pi_i$ (i.e., the position of the number $i$ in the sequence $\pi$). A graph $G$ is
a permutation graph if there exists a permutation $\pi$ such that $G$ is isomorphic
to the inversion graph $G_\pi$.
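
Definition 9 translates directly into code. The following sketch builds the edge set of the inversion graph from a permutation given as a Python list; the function name and the representation of edges as ordered pairs $(i, j)$ with $i < j$ are assumptions of this example.

```python
def permutation_graph(pi):
    """Edge set of the inversion graph G_pi for a permutation of 1..n."""
    n = len(pi)
    # pos[i] is the position of the number i in the sequence pi (pi^{-1}_i).
    pos = {value: index for index, value in enumerate(pi, start=1)}
    return {
        (i, j)
        for i in range(1, n + 1)
        for j in range(i + 1, n + 1)
        if (i - j) * (pos[i] - pos[j]) < 0
    }


print(sorted(permutation_graph([3, 1, 2])))
print(sorted(permutation_graph([1, 2, 3])))
```

For $\pi = [3, 1, 2]$ the number 3 precedes both 1 and 2, so the matching line of 3 crosses the other two and the edges are $(1,3)$ and $(2,3)$; the identity permutation gives no edges.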

3 Bounds and Algorithms 

We use the following heuristic, or small modifications of it, to $\lambda$-color graphs $G$.
First we find an elimination sequence, an ordering of the vertices,
$[v_1, \ldots, v_n]$, satisfying certain conditions for $G$. In order to do this, we rely on
the fact that all the graphs considered have the hereditary property, i.e., when
a special vertex is eliminated from a graph considered, the induced subgraph
remains the same type of graph. Then we simply apply the greedy algorithm
to color each vertex in the sequence by using the smallest available color in
$\{0, \ldots, \lambda\}$, satisfying the $L(d_1, d_2)$-constraint. For each graph $G$, we estimate the
total number of vertices at distance two that a vertex can have among the vertices
that have been colored so far. Finally we compute the upper bound for $\lambda$.
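
The greedy step of this heuristic can be sketched as follows. This is a generic illustration, not the authors' implementation: it takes an arbitrary vertex order as input (finding a good elimination sequence is the graph-class-specific part), and the function name and adjacency-list format are assumptions.

```python
def greedy_labeling(adj, order, d1, d2):
    """Color vertices in `order`, each with the smallest color that keeps
    the L(d1, d2)-constraint with respect to already colored vertices."""
    color = {}
    for v in order:
        near = set(adj[v])
        # Vertices reachable in two steps that are neither v nor neighbors of v.
        at_two = {w for u in adj[v] for w in adj[u]} - near - {v}
        forbidden = set()
        for group, gap in ((near, d1), (at_two, d2)):
            for w in group:
                if w in color:
                    # Colors closer than `gap` to color[w] are not allowed.
                    forbidden.update(range(color[w] - gap + 1, color[w] + gap))
        c = 0
        while c in forbidden:
            c += 1
        color[v] = c
    return color


# Hypothetical example: the path a - b - c - d, colored in the order a, b, c, d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(greedy_labeling(path, ["a", "b", "c", "d"], 2, 1))
```

Any order yields a valid labeling; the order only affects how large the maximum color becomes, which is why the sections below concentrate on bounding the number of colored vertices within distance two.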

3.1 Outerplanar and Planar Graphs 

We first give an algorithm for labeling an outerplanar graph and then use the 
result to obtain a bound for planar graphs. We need the following lemma. 

Lemma 1. In an outerplanar graph, there exists a vertex of degree at most one 
or a vertex of degree two which has one neighbor that has at most 4 neighbors. 

Proof. First, we show that a biconnected outerplanar graph $G$ has a vertex of
degree two with a neighbor of degree at most four. (We ignore trivial cases like a
graph with one vertex.) The inner dual $G^*$ of a biconnected outerplanar graph $G$
is formed by taking the dual of $G$ and then removing the vertex that represents
the outer region. It is easy to see that $G^*$ is a tree [?]. Note that all leaves in
$G^*$ correspond to a face with at least one vertex of degree two in $G$. Consider
two leaves, $u$ and $v$, of the inner dual with maximum distance in $G^*$. Suppose
to the contrary that all neighbors of a vertex of degree $\le 2$ have more than 4
neighbors in $G$. The face represented by $u$ has a vertex $w$ of degree two,
and both neighbors $x$ and $y$ of $w$ have degree more than 4. Thus, there are at
least four faces adjacent to $x$, and these form a path with length at least three
starting from $u$ in $G^*$. Similarly, there is a different path of length three starting
from $u$ in $G^*$ for the faces that contain $y$. One now can observe that there is a
path in $G^*$ of greater length than the path from $u$ to $v$, a contradiction. Hence $G$
has a vertex of degree two that has a neighbor of degree at most four.

Finally, every outerplanar graph $G = (V, E)$ is a subgraph of a biconnected
outerplanar graph $H = (V, F)$. A vertex $v$ that has degree at most two and
has a neighbor of degree at most 4 in $H$ has degree at most one in $G$ or has
degree two and a neighbor of degree at most 4 in $G$. □

Algorithm 1. Algorithm for Outerplanar Graphs

For i := 1 to n
    Find a vertex $v_i$ of degree $\le 2$ with a neighbor of degree at most 4,
    or of degree $\le 1$.
    If the neighbors of $v_i$ are not adjacent
        Then add a virtual edge between them.
    Temporarily remove $v_i$ from $G$.

For i := n to 1
    Label $v_i$ with the smallest available color in $\{0, \ldots, \lambda\}$
    satisfying the $L(2,1)$-constraint.
    If the neighbors of $v_i$ have a virtual edge
        Then remove the edge.


In Algorithm 1 we first find an elimination sequence, $[v_1, \ldots, v_n]$, using the
condition in Lemma 1. Note that this can always be done due to the hereditary
property of outerplanarity. Then, we use this order to color the vertices in a
greedy manner, i.e., we color a vertex $v_i$ with the first available color such that
the color of vertex $v_i$ differs by at least two from the colors of its already colored
neighbors, and differs by at least one from the colors of already colored vertices
at distance two. In this way, we make sure that the $L(2,1)$-constraint is fulfilled.
The virtual edge (if there is one) is there to guarantee that the colors of $v_i$'s
two neighbors are different when they are colored. Note that an operation that
removes $v_i$ and adds a virtual edge between its two neighbors does not increase
the maximum degree $\Delta$.

We can now compute the value of the maximum color used in Algorithm 1.

Theorem 3. There is an algorithm for finding an $L(2,1)$-labeling of an
outerplanar graph with $\lambda_{2,1} \le \Delta + 8$.

Proof. We use induction. The first vertex $v_n$ can simply be colored with color 0.
Suppose we have colored the vertices in the elimination sequence $[v_{i+1}, \ldots, v_n]$,
$i < n$. When we want to color the vertex $v_i$, $v_i$ can have at most two colored
neighbors. First, suppose $v_i$ has two colored neighbors. Each of these two
neighbors can account for at most 3 forbidden colors: if we color one of these nodes by
color $c$, then colors $c - 1$, $c$, and $c + 1$ are forbidden for $v_i$. This means possibly six
colors that are unavailable for coloring $v_i$. Now $v_i$ has at most $\Delta - 1 + 3$ vertices
at distance two, which means another $\Delta + 2$ colors that possibly cannot be used
for $v_i$. If there are at least $\Delta + 9$ colors, then there is always at least one color
available for $v_i$, i.e., $\lambda_{2,1} \le \Delta + 8$. A similar analysis can be used if $v_i$ has one
colored neighbor. □
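
The count in the proof of Theorem 3 can be written out as a single line. The grouping below only restates the numbers given in the proof (two colored neighbors, one of which has degree at most 4 by Lemma 1); it adds no new claim.

```latex
\begin{align*}
\underbrace{2 \cdot 3}_{\text{two colored neighbors}}
\;+\; \underbrace{(\Delta - 1) + 3}_{\text{vertices at distance two}}
  &= \Delta + 8 \quad \text{forbidden colors at most}, \\
\text{hence } \Delta + 9 \text{ colors } \{0, \ldots, \Delta + 8\} \text{ suffice, i.e. }
  \lambda_{2,1} &\le \Delta + 8 .
\end{align*}
```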



Corollary 1. There is an algorithm for finding an $L(2,1)$-labeling of a
triangulated outerplanar graph with $\lambda_{2,1} \le \Delta + 6$.

Proof. In a triangulated graph, there are at most $\Delta - 2 + 2 = \Delta$ distance-2
neighbors of $v_i$. The total number of colors needed is then $\Delta + 7$, or $\lambda_{2,1} \le \Delta + 6$. □

This improves the bound of $\lambda_{2,1} \le 2\Delta + 2$ in [?] for outerplanar graphs, for
sufficiently large $\Delta$.

We now use the above bound for outerplanar graphs to obtain a bound for 
planar graphs. 

Theorem 4. There is an algorithm for finding an $L(2,1)$-labeling of a planar
graph with $\lambda_{2,1} \le 3\Delta + 28$.

Proof. Any planar graph can be partitioned into layers in the following way: all
vertices at the exterior face are in layer 0, and all vertices not in layers $0, \ldots, i$
that are adjacent to a vertex in layer $i$ are in layer $i + 1$. Each layer induces an
outerplanar graph. (See [?].) Now, we can color the vertices on layers 0, 3, 6, etc.
with colors $0, \ldots, \Delta + 8$; vertices on layers 1, 4, 7, etc. with colors $\Delta + 10, \ldots, 2\Delta + 18$;
and vertices on layers 2, 5, 8, etc. with colors $2\Delta + 20, \ldots, 3\Delta + 28$; each layer
is colored with the algorithm for outerplanar graphs. Because of the separations
between layers, this is an $L(2,1)$-coloring. □
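
The three color ranges in the proof follow one pattern: each block holds $\Delta + 9$ colors and is followed by one unused separator color, so consecutive blocks start $\Delta + 10$ apart. A minimal sketch of this arithmetic (the function name is hypothetical):

```python
def layer_color_range(layer, delta):
    """Inclusive color range used for the given layer in the proof of Theorem 4."""
    block = delta + 10          # Delta + 9 colors plus one unused separator
    low = (layer % 3) * block   # layers 0, 3, 6, ... share the first block, etc.
    return low, low + delta + 8


for layer in range(4):
    print(layer, layer_color_range(layer, 4))
```

With $\Delta = 4$ this gives the ranges $0..12$, $14..26$ and $28..40$, and layer 3 reuses the range of layer 0.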



Corollary 2. For any triangulated planar graph, $\lambda_{2,1} \le 3\Delta + 22$.

This improves the bound of $\lambda_{2,1} \le 8\Delta - 13$ for planar graphs from [?], for
sufficiently large $\Delta$.

Theorem 5. For outerplanar graphs, there are polynomial time algorithms for
labeling the graphs such that $\lambda_{0,1} \le \Delta + 2$ and $\lambda_{1,1} \le \Delta + 4$.

Proof. We apply the same algorithm as in Algorithm 1 to find an eliminating
sequence $[v_n, \ldots, v_1]$ and then greedily color each $v_i$ in the sequence with the
smallest available color satisfying the $L(0,1)$-constraint. As in the proof of
Theorem 3, each $v_i$ in the eliminating sequence has at most $\Delta + 2$ neighbors at
distance 2, which must have different colors from $v_i$. Now $v_i$ can have at most
two neighbors, but they can have the same color as $v_i$. So we can color $v_i$ and
its neighbors with an extra color. The bound for $\lambda_{0,1}$ now follows.

A similar argument applies for $\lambda_{1,1}$. □



Theorem 6. For planar graphs, there are polynomial time algorithms for
labeling the graphs such that $\lambda_{0,1} \le 2\Delta + 5$ and $\lambda_{1,1} \le 3\Delta + 14$.

It is not hard to see that the algorithms mentioned above can be implemented
to run in time $O(n\Delta)$, where $n$ is the number of vertices.

3.2 Graphs of Treewidth k

For a graph $G = (V, E)$ of treewidth $k$, we first take a tree-decomposition
$(\{X_i \mid i \in I\}, T = (I, F))$ with $\{X_i \mid i \in I\}$ a collection of subsets of $V$,
and $T = (I, F)$ a tree. Let $d(v, w)$ be the distance between vertices $v$ and $w$.

Algorithm 2. Algorithm for Graphs of Treewidth k

1: Add a set of virtual edges $E'$:
   Here $(v, w) \in E'$ iff $(v, w) \notin E$ and $\exists i : v, w \in X_i$.

2: Find a perfect elimination sequence $[v_1, \ldots, v_n]$ in $G' = (V, E \cup E')$.

3: For i := n to 1
   Label $v_i$ with the smallest available color, such that for all
   already colored vertices $w$:
      If $(v_i, w) \in E$, then the colors of $v_i$ and $w$ differ by at least 2,
      If $(v_i, w) \in E'$, then the colors of $v_i$ and $w$ differ by at least 1,
      If $d(v_i, w) = 2$, then the colors of $v_i$ and $w$ differ by at least 1.
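
Step 1 of Algorithm 2 can be sketched as follows: every pair of vertices that shares a bag but is not already an edge becomes a virtual edge. The dictionary format for the bags and the function name are assumptions of this example, not part of the paper.

```python
from itertools import combinations


def virtual_edges(edges, bags):
    """Pairs {v, w} that share a bag X_i but are not edges of G."""
    real = {frozenset(e) for e in edges}
    virtual = set()
    for bag in bags.values():
        for v, w in combinations(sorted(bag), 2):
            if frozenset((v, w)) not in real:
                virtual.add(frozenset((v, w)))
    return virtual


# Hypothetical example: the 4-cycle a - b - c - d - a with two bags.
cycle = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
bags = {1: {"a", "b", "c"}, 2: {"a", "c", "d"}}
print(virtual_edges(cycle, bags))
```

Here the only virtual edge is the chord between `a` and `c`, which turns each bag into a clique.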



Theorem 7. There is a polynomial time algorithm for labeling a graph $G$ of
treewidth $k$ with $\lambda_{2,1} \le k\Delta + 2k$.

Proof. After all the virtual edges are added, the new graph $G' = (V, E \cup E')$ is
chordal, hence has a perfect elimination sequence (from Fact 2). Also, $G'$ has
the same tree decomposition as $G$, so it has treewidth at most $k$. When we are
ready to color a vertex $v$, $v$ has at most $k$ colored neighbors because $v$ with its
neighbors forms a clique in $G'$. Now by the perfect elimination sequence property,
an already colored vertex at distance two of $v$ must be adjacent to an already
colored neighbor of $v$ in $G'$. Hence, at most $3k$ colors are unavailable due to the
neighbors of $v$, and at most $(\Delta - 1)k$ colors are unavailable due to the vertices at
distance two. If we have $k\Delta + 2k + 1$ colors, then a color for $v$ is always available.
The bound now follows. □



Corollary 3. There is a polynomial time algorithm for labeling a $k$-tree with
$\lambda_{2,1} \le k\Delta - k^2 + 3k$.

Proof. As a $k$-tree is always triangulated, there are at most $k(\Delta - 1 - (k - 1)) =
k\Delta - k^2$ distance-2 neighbors of $v_i$. The total number of colors needed is then
$k\Delta - k^2 + 3k + 1$, or $\lambda_{2,1} \le k\Delta - k^2 + 3k$. □

We can apply our algorithm for graphs of treewidth $k$ to the other $\lambda$-variants.

Theorem 8. For graphs of treewidth $k$, $\lambda_{0,1} \le k\Delta - k$ and $\lambda_{1,1} \le k\Delta$.

The labeling algorithms given in this section can be implemented in time
$O(kn\Delta)$, assuming that we are given the tree-decomposition, which can be found
in linear time for treewidth of constant size $k$ [?].

Recently, in [?] it has been shown that $\lambda_{1,1}$ can be computed in polynomial
time for graphs with constant treewidth $k$. A similar argument would yield the
result for $\lambda_{0,1}$ as well.



3.3 Permutation Graphs 

Theorem 9. There is a polynomial time algorithm for labeling a permutation
graph with $\lambda_{2,1} \le 5\Delta - 2$.

Proof. Suppose the vertices are numbered $1, 2, \ldots, n$, and we have a permutation
$\pi$, with $(i, j) \in E$ iff $(i - j)(\pi^{-1}_i - \pi^{-1}_j) < 0$ (i.e., the lines cross). We color the
vertices from 1 to $n$ in order, using the smallest color available satisfying the
usual $L(2,1)$-constraint. To show that the stated bound is sufficient, we make
use of the following two claims.

Suppose we are in the midst of this algorithm, ready to color a vertex v. Let 
w be a vertex at distance 2 from v. Note that v and w can have distance two 
via a path across a colored vertex or via a path across an uncolored vertex. 

Claim. Suppose there is a path $[v, y, w]$ with $y$ a vertex that is not yet colored.
Let $x$ be the neighbor of $v$ such that $\pi^{-1}_x$ is minimal. Then either $w$ is a neighbor
of $v$ or $x$, or $w$ is not colored.

Proof. Suppose $w$ is not a neighbor of $v$ and $w$ is already colored. Then we have
$w < v$ and $\pi^{-1}_w < \pi^{-1}_v$. As $y$ is not yet colored, $y > v$, hence $\pi^{-1}_y < \pi^{-1}_v$. Also
$y > v > w$, so, as $(y, w)$ is an edge, $\pi^{-1}_y < \pi^{-1}_w$. By assumption, $\pi^{-1}_x \le \pi^{-1}_y$,
so $\pi^{-1}_x < \pi^{-1}_w < \pi^{-1}_v$. As $x$ is a neighbor of $v$, it follows that $x > v > w$, while
$\pi^{-1}_w > \pi^{-1}_x$. Therefore $(w, x)$ is an edge. □

The proof of the next claim is similar to the proof above. 

Claim. Suppose there is a path $[v, y, w]$ with $y$ a vertex that is already colored.
Let $x$ be the neighbor of $v$ with $x$ having a minimal $\pi^{-1}_x$ among the neighbors.
Then either $w$ is a neighbor of $v$ or $x$, or $w$ is not colored.

We now count the total number of colors needed. The vertices $x$ in the above
two claims are adjacent to $v$, so they both have at most $\Delta - 1$ neighbors that
are at distance two to $v$. We thus get a total of $3\Delta + 2(\Delta - 1) + 1 = 5\Delta - 1$
colors. □



Theorem 10. For permutation graphs, there are polynomial time algorithms for
labeling the graphs such that $\lambda_{0,1} \le 2\Delta - 2$ and $\lambda_{1,1} \le 3\Delta - 2$.

All the above algorithms can also be implemented in $O(n\Delta)$ time.

3.4 Split Graphs

So far all the bounds for $\lambda$ that we have obtained are linear in $\Delta$. For split
graphs we give a non-linear bound for $\lambda$ and show that there are split graphs
that require this bound.

Theorem 11. There is a polynomial time algorithm for labeling a split graph
with $\lambda_{2,1} \le \Delta^{1.5} + 2\Delta + 2$.

Proof. Let $S$ be the independent set and $K$ the clique that split $G$. Then $|K| \le \Delta + 1$.
We use colors $0, 2, \ldots, 2\Delta$ to color the vertices in $K$. For $S$, we will use colors
from the set $\{2\Delta + 2, \ldots, \Delta^{1.5} + 2\Delta + 2\}$. If $|S| \le \Delta^{1.5}$, we just give every vertex
in $S$ a distinct color and we are done. Suppose $|S| > \Delta^{1.5}$. We claim that there
is always a vertex $v$ in $S$ with degree $\le \Delta^{0.5}$. For if this is not the case, then all
vertices in $S$ have degree $> \Delta^{0.5}$. This implies that the total number of edges
emanating from $S$ is greater than $\Delta^2$. By the Pigeonhole Principle, there would
be some vertex in the clique $K$ that has degree $> \Delta$; a contradiction. Now, let
$v$ be a vertex in $S$ with degree $\le \Delta^{0.5}$. Recursively color the graph obtained by
removing $v$ from $G$. The vertex $v$ has at most $\Delta^{1.5}$ vertices in $S$ at distance two, but as we
have $\Delta^{1.5} + 1$ colors to color $S$, we always have a color left for $v$. Finally, note
that adding $v$ back cannot decrease the distances between other vertices in $G$ as
the neighborhood of $v$ is a clique. □
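
The construction in this proof can be sketched in code. The version below is a greedy variant and an assumption of this example, not the authors' procedure. Degrees of independent-set vertices do not change when other independent-set vertices are removed, so the recursive removal of low-degree vertices is replaced here by coloring the independent set in order of decreasing degree. The function name and the input format (adjacency lists plus the two sides of the split) are hypothetical.

```python
def label_split_graph(adj, clique, independent):
    """Greedy sketch of an L(2,1)-labeling of a split graph."""
    delta = max(len(adj[v]) for v in adj)
    color = {}
    # Clique vertices receive the even colors 0, 2, 4, ...
    for index, v in enumerate(clique):
        color[v] = 2 * index
    base = 2 * delta + 2  # first color reserved for the independent set
    independent_set = set(independent)
    # High-degree vertices first, so low-degree vertices are colored last.
    order = sorted(independent, key=lambda v: len(adj[v]), reverse=True)
    for v in order:
        # Two independent-set vertices are at distance two iff they share
        # a neighbor (which necessarily lies in the clique).
        rivals = {w for u in adj[v] for w in adj[u]
                  if w in independent_set and w != v}
        used = {color[w] for w in rivals if w in color}
        c = base
        while c in used:
            c += 1
        color[v] = c
    return color


# Hypothetical example: clique {k1, k2}, independent set {s1, s2, s3}.
adj = {
    "k1": ["k2", "s1", "s2"],
    "k2": ["k1", "s3"],
    "s1": ["k1"],
    "s2": ["k1"],
    "s3": ["k2"],
}
print(label_split_graph(adj, ["k1", "k2"], ["s1", "s2", "s3"]))
```

In the example, `s1` and `s2` share the neighbor `k1` and receive different colors, while `s3` is at distance three from both and may reuse the first color of the reserved range.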



Theorem 12. For split graphs, there are polynomial time algorithms for labeling
the graphs such that $\lambda_{0,1} \le \Delta^{1.5}$ and $\lambda_{1,1} \le \Delta^{1.5} + \Delta + 1$.

We now show that the above bounds for $\lambda$ are actually tight (within a constant
factor).



Theorem 13. For every $\Delta$, there is a split graph with
$\lambda_{2,1} \ge \lambda_{1,1} \ge \lambda_{0,1} \ge \frac{1}{3}\sqrt{\frac{2}{3}}\,\Delta^{1.5} - 1$.

Proof. Consider the following split graph. We take an independent set of
$\frac{1}{3}\sqrt{\frac{2}{3}}\,\Delta^{1.5}$ vertices. This set is partitioned into $\sqrt{\frac{2}{3}\Delta}$ groups, each consisting
of $\Delta/3$ vertices. The clique consists of $\Delta/3 + 1$ vertices. Note that we have less
than $\left(\sqrt{\frac{2}{3}\Delta}\right)^2 / 2 = \Delta/3$ distinct pairs of groups. For each such pair of groups,
we take one unique vertex in the clique, and make that vertex adjacent to each
vertex in these two groups. In this way, the maximum degree is exactly $\Delta$: each
vertex in the clique is adjacent to $\Delta/3$ vertices in the clique and at most $2\Delta/3$
vertices in the independent set.

Now, the resulting graph has diameter two, and any pair of vertices in the
independent set have distance exactly two. So, in any $L(0,1)$-labeling of the
graph (or $L(1,1)$- or $L(2,1)$-labeling), all vertices in the independent set must
receive different colors. □



As split graphs are also chordal graphs [?], the above theorem provides a
non-linear lower bound for the upper bound of $\lambda_{2,1}$ in [?].

4 Complexity

Without proofs we mention that it is NP-complete to decide whether $\lambda_{2,1} \le 10$
for a given bipartite planar graph $G = (V, E)$, and that it is NP-complete to
decide whether $\lambda_{2,1} \le |V|$ for a given split graph (and hence this problem also
is NP-complete for chordal graphs).

The proofs can be found in [?]. The result for planar bipartite graphs uses
a reduction from 3-coloring 4-regular planar graphs. The result for split graphs
uses a reduction from Hamiltonian Path with a modification of a technique from
[?].

The problems whether $\lambda_{0,1} \le |V|$ and $\lambda_{1,1} \le |V|$ are also NP-complete for
split graphs and chordal graphs.

5 Concluding Remarks 

We have given upper bounds of $\lambda$ for some of the well-known graphs. However,
we lack examples of graphs where these bounds are matched. It should be
possible to tighten the constant factors in the bounds somewhat. For example, in
outerplanar graphs, the conjecture is that $\lambda_{2,1} \le \Delta + 2$ and we have $\lambda_{2,1} \le \Delta + 8$.
Also, it is not clear that for planar graphs we need $3\Delta$ in the bounds. Surely the
constant 28 in $\lambda_{2,1} \le 3\Delta + 28$ is much too large. Similar comments apply to the
other graphs studied in this paper as well.

For graphs of treewidth $k$, the $L(0,1)$-labeling and $L(1,1)$-labeling problems
are polynomial for constant $k$. The corresponding problem for $L(2,1)$-labelings
appears to be an interesting (but apparently not easy) open problem. The
corresponding problems for interval graphs and outerplanar graphs also remain open.

It is conjectured in [?] that $\lambda_{2,1} \le \Delta^2$ for any graph. This is true for all
the graphs that have been studied, but the problem remains open. For chordal
graphs, [?] has shown that $\lambda_{2,1} \le O(\Delta^2)$. We have shown for split graphs (a
special case of chordal graphs) that $\lambda_{2,1} = O(\Delta^{1.5})$. What is the best bound for
chordal graphs?

Abstract. We exhibit a relativized world where NP ∩ SPARSE has
no complete sets. This gives the first relativized world where no optimal
proof systems exist.

We also examine under what reductions NP ∩ SPARSE can have complete
sets. We show a close connection between these issues and reductions
from sparse to tally sets. We also consider the question as to
whether the NP ∩ SPARSE languages have a computable enumeration.



1 Introduction 

Computer scientists study lower bounds in proof complexity with the ultimate
hope of actual complexity class separation. Cook and Reckhow [CR79] formalize
this approach. They create a general notion of a proof system and show that
polynomial-size proof systems exist if and only if NP = coNP.

Cook and Reckhow also ask about the possibility of whether optimal proof 
systems exist. Informally an optimal proof system would have proofs which are 
no more than polynomially longer than any other proof system. 

An optimal proof system would play a role similar to NP-complete sets. 
There exists a polynomial-time algorithm for Satisfiability if and only if P = 
NP. Likewise, if we have an optimal proof system, then this system would have 
polynomial-size proofs if and only if NP = coNP. 

* Several proofs have been omitted to conserve space. A full version can be found at
http://www.neci.nj.nec.com/homepages/fortnow/papers.

The existence of optimal proof systems remained an interesting open question.
No one could exhibit such a system except under various unrealistic
assumptions [KP89, MT98]. Nor has anyone exhibited a relativized world where
optimal proof systems do not exist.

We construct such a world by building the first oracle relative to which
NP ∩ SPARSE does not have complete sets. Messner and Torán [MT98] give a
relativizable proof that if an optimal proof system exists then NP ∩ SPARSE
does have complete sets.

We also consider whether NP ∩ SPARSE-complete sets exist under other
more general reductions than the standard many-one reductions. We show several
results such as:

— There exists a relativized world where NP ∩ SPARSE has no
disjunctive-truth-table complete sets.

— There exists a relativized world where NP ∩ SPARSE has no complete sets
under truth-table reductions using $o(n/\log n)$ queries.

— For any positive constant $c$, there exists an oracle relative to which the class
NP ∩ SPARSE has no complete sets under truth-table reductions using
$o(n/\log n)$ queries and $c \cdot \log n$ bits of advice.

— Under a reasonable assumption, for all values of $k > 0$, NP ∩ SPARSE
has a complete set under conjunctive truth-table reductions that ask $\frac{n}{k \log n}$
queries and use $O(\log n)$ bits of advice.

The techniques used for relativized results on NP ∩ SPARSE-complete sets
also apply to the question of reducing sparse sets to tally sets. We show several
results along these lines as well.

— Every sparse set $S$ is reducible to some tally set $T$ under a 2-round
truth-table reduction asking $O(n)$ queries.

— Let $c$ be any positive constant. There exists a sparse set $S$ that does not
reduce to any tally set $T$ under truth-table reductions using $o(n/\log n)$ queries
even with $c \cdot \log n$ bits of advice.

— Under a reasonable assumption, for every sparse set $S$ and every positive
constant $k$, there exists a tally set $T$ and a ctt-reduction from $S$ to $T$ that
asks $\frac{n}{k \log n}$ queries and uses $O(\log n)$ bits of advice. We can also have a 2-round
truth-table reduction using $\frac{n}{k \log n}$ queries and no advice.

We use the “reasonable assumptions” to derandomize some of our constructions,
building on techniques of Klivans and Van Melkebeek [KvM99]. The assumption
we need is that there exists a set in $\mathrm{DTIME}[2^{O(n)}]$ that requires
circuits of size $2^{\Omega(n)}$ even when the circuits have access to an oracle for SAT.
Under this assumption we get tight bounds as described above.

We also examine how NP ∩ SPARSE compares with other promise classes
such as UP and BPP, in particular looking at whether NP ∩ SPARSE has a
uniform enumeration.

The proofs in our paper heavily use techniques from Kolmogorov complexity.
We recommend the book of Li and Vitányi [LV97] for an excellent treatment of
this subject.

1.1 Reductions and Relativizations

We measure the relative power of sets using reductions. In this paper all reduc- 
tions will be computed by polynomial-time machines. 

We say a set $A$ reduces to a set $B$ if there exists a polynomial-time computable
function $f$ such that for all strings $x$, $x$ is in $A$ if and only if $f(x)$ is in $B$. We
also call this an m-reduction, “m” for many-one.

For more general reductions we need to use oracle machines. The set $A$
Turing-reduces to $B$ if there is a polynomial-time oracle Turing machine $M$
such that $M^B(x)$ accepts exactly when $x$ is in $A$. A tt-reduction (truth-table)
requires that all queries be made before any answers are received.

A 2-round tt-reduction allows a second set of queries to be made after the
answers from the first set of queries are known. This can be generalized to $k$-round
tt-reductions but we will not need $k > 2$ in this paper.

We can think of a (one-round) tt-reduction $R$ as consisting of two
polynomial-time computable functions: one that creates a list of queries to make and an
evaluator that takes the input and the value of $B$ on those queries and either
accepts or rejects. We use the notation $Q_R(x)$ to denote the set of queries made
by reduction $R$ on input $x$. For a set of inputs $X$, we let $Q_R(X) = \bigcup_{x \in X} Q_R(x)$.

A dtt-reduction (disjunctive-truth-table) means that $M^B(x)$ accepts if any
of the queries it makes are in $B$. A ctt-reduction (conjunctive-truth-table) means
that $M^B(x)$ accepts if all of the queries it makes are in $B$. A $q(n)$-tt reduction
is a tt-reduction that makes at most $q(n)$ queries. A btt-reduction
(bounded-truth-table) is a $k$-tt reduction for some fixed $k$.
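
The two-function view of a one-round tt-reduction can be illustrated with a toy sketch. Everything in it is hypothetical and chosen only for illustration: the finite set `B`, the query generator, and the function names. Running-time bounds are not modeled.

```python
def run_tt_reduction(x, make_queries, evaluate, in_B):
    """One-round truth-table reduction: all queries are fixed first,
    then the evaluator sees the input and the oracle answers."""
    queries = make_queries(x)              # Q_R(x), computed without any answers
    answers = [in_B(q) for q in queries]
    return evaluate(x, answers)


def dtt_evaluate(x, answers):
    """Disjunctive evaluator: accept if any query is in B."""
    return any(answers)


def ctt_evaluate(x, answers):
    """Conjunctive evaluator: accept if all queries are in B."""
    return all(answers)


# Toy oracle set and query generator (assumptions for this example).
B = {"ab", "cd"}


def substrings_of_length_two(x):
    return [x[i:i + 2] for i in range(len(x) - 1)]


print(run_tt_reduction("xabz", substrings_of_length_two, dtt_evaluate, B.__contains__))
print(run_tt_reduction("xabz", substrings_of_length_two, ctt_evaluate, B.__contains__))
```

On input `xabz` the queries are `xa`, `ab` and `bz`; exactly one of them is in `B`, so the disjunctive reduction accepts and the conjunctive one rejects.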

We say a language L is r-hard for a class C if every language in C r-reduces 
to L. If L also sits in C then we say L is r-complete for C. 

All the results mentioned and cited in this paper relativize, that is they hold 
if all machines involved can access the same oracle. If we show that a statement 
holds in a relativized world that means that proving the negation would require 
radically different techniques. Please see the survey by Fortnow [For94] for a
further discussion on relativization.



1.2 Optimal Proof Systems 

A proof system is simply a polynomial-time function whose range is the set of
tautological formulae, i.e., formulae that remain true for all assignments. Cook
and Reckhow [CR79] developed this concept to give a general proof system that
generalizes proof systems such as resolution and Frege proofs. They also give an
alternate characterization of the NP versus coNP question:

Theorem 1 (Cook-Reckhow). NP = coNP if and only if there exists a
proof system $f$ and a polynomial $p$ such that for all tautologies $\phi$, there is a $y$ with
$|y| \le p(|\phi|)$ and $f(y) = \phi$.

Cook and Reckhow also defined optimal and p-optimal proof systems.

Definition 1. A proof system $g$ is optimal if for all proof systems $f$, there
is a polynomial $p$ such that for all $x$, there is a $y$ such that $|y| \le p(|x|)$ and
$g(y) = f(x)$. A proof system $g$ is p-optimal if $y$ can be computed in polynomial
time from $x$.

Messner and Torán [MT98], building on work of Krajíček and Pudlák [KP89],
show that if NEE = coNEE then optimal proof systems exist and if NEE = EE
then p-optimal proof systems exist. Here EE, double exponential time, is equal
to $\mathrm{DTIME}[2^{O(2^n)}]$. The class NEE is the nondeterministic version of EE.

Messner and Torán [MT98] show consequences of the existence of optimal
proof systems.

Theorem 2 (Messner-Torán).

— If p-optimal proof systems exist then UP has complete sets.

— If optimal proof systems exist then NP ∩ SPARSE has complete sets.

Hartmanis and Hemachandra [HH84] give a relativized world where UP does
not have complete sets. Since all of the results mentioned here relativize, Messner
and Torán get the following corollary.

Corollary 1 (Messner-Torán). There exists an oracle relative to which
p-optimal proof systems do not exist.

However Messner and Torán leave open the question as to whether a relativized
world exists where there are no optimal proof systems. Combining our relativized
world where NP ∩ SPARSE has no complete sets with Theorem 2 answers this
question in the positive.

1.3 Reducing SPARSE to TALLY 

A tally set is any subset of $1^*$. Given a set $S$, the census function $c_S(n)$ is the
number of strings of length $n$ in $S$. A set $S$ is sparse if the census function is
bounded by a polynomial.
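
For a finite sample of strings, the census function and the tally property are easy to compute. The sketch below is only an illustration with hypothetical names; sparseness itself is an asymptotic property of an infinite set and cannot be decided from a finite sample.

```python
def census(strings, n):
    """Number of distinct strings of length n in the given finite set."""
    return sum(1 for s in set(strings) if len(s) == n)


def is_tally(strings):
    """True iff every string consists only of the symbol 1."""
    return all(set(s) <= {"1"} for s in strings)


S = {"0", "1", "00", "11", "111"}
print([census(S, n) for n in range(1, 4)])
print(is_tally(S), is_tally({"", "1", "111"}))
```

A tally set has census at most 1 at every length, so every tally set is sparse.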

In some sense both sparse sets and tally sets contain the same amount of 
information but in sparse sets the information may be harder to find. Determin- 
ing for which kind of reductions SPARSE can reduce to TALLY is an exciting 
research area. 

Book and Ko [BK88] show that every sparse set tt-reduces to some tally set
but there is some sparse set that does not btt-reduce to any tally set.

Ko [Ko89] shows that there is a sparse set that does not dtt-reduce to any
tally set. He left open the conjunctive case.

Buhrman, Hemaspaandra and Longpré give the surprising result that every
sparse set ctt-reduces to some tally set. Later Saluja [Sal93] proves
the same result using slightly different techniques.

Schöning [Sch93] uses these ideas to show that SPARSE many-one reduces
to TALLY with randomized reductions. In particular he shows that for every
sparse set S and polynomial p there is a tally set T and a probabilistic
polynomial-time computable f such that

— If x is in S then f(x) is always in T.

— If x is not in S then Pr[f(x) ∈ T] < 1/p(|x|).

We say that S co-rp-reduces to T. Schöning notes that his reduction only requires
O(log n) random bits.



1.4 Complete Sets for NP ∩ SPARSE

Hartmanis and Yesha first considered the question as to whether the
class NP ∩ SPARSE has complete sets. They show that there exists a tally
set T that is Turing-complete for NP ∩ SPARSE. They also give a relativized
world where there is no tally set that is m-complete for NP ∩ SPARSE.

We should note that NP ∩ TALLY has m-complete sets. Let M_i be an
enumeration of polynomial-time nondeterministic machines and consider

{1^{⟨i,n,k⟩} | M_i accepts 1^n in k steps}. (1)

Also there exists a set in D^p ∩ SPARSE that is m-hard for NP ∩ SPARSE.
The class D^p contains the sets that can be written as the difference of two NP
sets. For the NP ∩ SPARSE-hard language we need to consider the difference
A − B where:

A = {⟨x, 1^i, 1^k⟩ | M_i(x) accepts in k steps}

B = {⟨x, 1^i, 1^k⟩ | M_i accepts more than k strings of length |x| in k steps}

As a simple corollary we get that if NP = coNP then NP ∩ SPARSE has
complete sets. However, the results mentioned in Section 1.2 imply that one only
needs the assumption of NEE = coNEE.

Schöning [Sch93] notes that from his work mentioned in Section 1.3, if the
sparse set S is in NP then the corresponding tally set T is also in NP. Since
NP ∩ TALLY has complete sets we get that NP ∩ SPARSE has a complete set
under co-rp-reductions. The same argument applied to Buhrman-Hemaspaandra-Longpré
shows that NP ∩ SPARSE has complete sets under ctt-reductions.

2 NP ∩ SPARSE-Complete Sets

In this section, we establish our main result. 

Theorem 3. There exists a relativized world where NP ∩ SPARSE has no
complete sets under many-one reductions. 

Proof. Let M_i be a standard enumeration of nondeterministic polynomial-time
Turing machines and f_i be an enumeration of polynomial-time reductions where
M_i and f_i use at most time n^i.

Let t(m) be the tower function, i.e., t(0) = 1 and t(m + 1) = 2^{t(m)}.

We will build an oracle A. For each i we will let

L_i(A) = {x | there is some y, |y| = 2|x| and ⟨i,x,y⟩ ∈ A}. (2)
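
To make the two ingredients of the construction concrete, here is a small Python sketch. It assumes the usual recurrence t(m + 1) = 2^{t(m)} for the tower function and represents a finite piece of the oracle A as a set of triples (i, x, y); the names tower and L_i are illustrative and not taken from the paper.

```python
def tower(m):
    """Tower function: t(0) = 1 and t(m + 1) = 2 ** t(m) (assumed recurrence)."""
    t = 1
    for _ in range(m):
        t = 2 ** t
    return t


def L_i(A, i):
    """Strings x such that some triple (i, x, y) with |y| = 2|x| lies in A.

    A is a finite set of triples (index, x, y) standing in for the oracle.
    """
    return {x for (k, x, y) in A if k == i and len(y) == 2 * len(x)}
```

The rapid growth of the tower function is what keeps the stages apart: a polynomial-time computation on inputs of length t(m) can exhaustively handle the strings added at earlier stages and cannot reach the lengths used at later stages.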

The idea of the proof is that for each i and j, we will guarantee that either
L(M_i^A) has more than n^i elements at some input length n or L_i(A) is sparse
and f_j^A does not reduce L_i(A) to L(M_i^A).

We start with the oracle A empty and build it up in stages. At each stage
m = ⟨i,j⟩ we will add strings of the form ⟨i,x,y⟩ to A where |x| = n = t(m)
and |y| = 2n. For each stage m we will do one of the following:

1. Put more than r^i strings into L(M_i^A) for some length r, or

2. Make L_i(A) ∩ Σ^n have exactly one string and for some x in Σ^n, have

x ∈ L_i(A) ⇔ f_j^A(x) ∉ L(M_i^A). (3)

By the usual tower arguments we can focus only on the strings in A of length 
n: Smaller strings can all be queried in polynomial-time; larger strings are too 
long to be queried. 

Pick a string z of length 2n·2^n that is Kolmogorov random conditioned on
the construction of A so far. Read off 2^n strings y_x of length 2n, one for each x in
Σ^n. Consider B = {⟨i,x,y_x⟩ | x ∈ Σ^n}.

If L(M_i^B) has more than r^i strings of any length r then we can fulfill the
requirement for this stage by letting A = B. So let us assume this is not the
case.

Note that f_j^B(x) for x of length n cannot query any string in B or we
would have a shorter description of z by describing some y_w by x and the index of the
query made by f_j^B(x). Our final oracle will be a subset of B so we can just use
f_j^B as the reduction.

Suppose f_j^B(x) = f_j^B(w) for some distinct x and w of length n. We just let A contain
the single string ⟨i,x,y_x⟩ and f_j^B cannot be a reduction. Let us now assume that
there is no such x and w.

So by counting (f_j^B maps the 2^n strings of Σ^n to distinct strings of length at
most n^j, of which L(M_i^B) contains only polynomially many) there must be some
x ∈ Σ^n such that f_j^B(x) ∉ L(M_i^B). Let v = f_j^B(x). We are not done yet since
L_i(B) has too many strings.

Now let A again consist of the single string ⟨i,x,y_x⟩. If we still have v ∉
L(M_i^A) then we have now fulfilled the requirement.

Otherwise it must be the case that M_i^A(v) accepts but M_i^B(v) rejects. Thus
every accepting path (and in particular the lexicographically least) of M_i^A(v)
must query some string in B − A. Since we can describe v by x this allows us
a short description of some y_w given y_x for w ≠ x which gives us a shorter
description of z, so this case cannot happen. □

Corollary 2. There exists a relativized world where optimal proof systems do 
not exist. 



Proof. Messner and Torán [MT98] give a relativizable proof that if optimal proof
systems exist then NP ∩ SPARSE has complete sets. □
3 More Powerful Reductions

In the previous section, we constructed a relativized world where the class
NP ∩ SPARSE has no complete sets under m-reductions. We now strengthen
that construction to more powerful reductions. Using the same techniques as well
as other ones, we will also obtain new results on the reducibility of SPARSE
to TALLY.



3.1 Relativized Worlds 

We start by extending Theorem 3 to dtt-reductions. We remind the reader that
the proofs for these and all theorems in our paper can be found in the full version
as noted in the footnote on the first page.

Theorem 4. There exists a relativized world where NP ∩ SPARSE has no
dtt-complete sets.

The proof of Theorem 4 works for any subexponential density bound. In
particular, it yields a relativized world where the class of NP sets with no more
than 2^{n^{o(1)}} strings of any length n has no dtt-complete sets.

We can handle polynomial-time tt-reductions with arbitrary evaluators provided
the number of queries remains in o(n/log n).

Theorem 5. There exists a relativized world where NP ∩ SPARSE has no
complete sets under o(n/log n)-tt-reductions.

For sets of subexponential density the proof of Theorem 5 yields a relativized
world where the class of NP sets containing no more than 2^{n^{o(1)}} strings of any
length n has no complete sets under tt-reductions of which the number of queries
is at most n^α for some α < 1.

On the positive side, recall from Section 1.4 that NP ∩ SPARSE has complete
sets under ctt-reductions as well as under co-rp-reductions.

3.2 SPARSE to TALLY 

The techniques used in the proofs of Theorems 3, 4 and 5 also allow us to
construct a sparse set S that does not reduce to any tally set under the type
of reductions considered. As mentioned in Section 1.3, such sets were already
known for m-reductions and for dtt-reductions. For o(n/log n)-tt-reductions we
provide the first construction.

Theorem 6. There exists a sparse set S that does not o(n/log n)-tt-reduce to
any tally set.

On the other side, O(n) queries suffice to reduce any sparse set to a tally set.
Previously, it was known that SPARSE ctt- and co-rp-reduces to TALLY (see
Section 1.3). We give the first deterministic reduction for which the degree of
the polynomial bounding the number of queries does not depend on the density
of the sparse set.

Theorem 7. Every sparse set S is reducible to some tally set T under a 2-round
tt-reduction asking O(n) queries.

Proof. Schöning [Sch93] shows that for any constant k > 0 there exists a tally set
T_1 and a polynomial-time reduction R such that for any string x of any length
n

x ∈ S ⇒ Pr[R(x,ρ) ∈ T_1] = 1
x ∉ S ⇒ Pr[R(x,ρ) ∈ T_1] < 1/n^k (4)

where the probabilities are uniform over strings ρ of length O(log n).

By picking n/(k log n) independent samples ρ_i, we have for any x ∈ Σ^n:

x ∈ S ⇒ Pr[(∀i) R(x,ρ_i) ∈ T_1] = 1
x ∉ S ⇒ Pr[(∀i) R(x,ρ_i) ∈ T_1] < 2^{-n}

Therefore, there exists a sequence ρ_i, i = 1, ..., n/(k log n), such that

∀x ∈ Σ^n : x ∈ S ⇔ (∀i) R(x,ρ_i) ∈ T_1. (5)

Since each ρ_i is of length O(log n), we can encode them in a tally set T_2 from
which we can recover them using O(n/(k log n) · log n) nonadaptive queries. This way,
we obtain a 2-round tt-reduction from S to T_1 ⊕ T_2 using O(n) queries: The
first round determines the ρ_i's, and the second round applies (5). Since T_1 ⊕ T_2
m-reduces to a tally set T, we are done. □
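
The amplification step in this proof is a short calculation, sketched below in Python. The sketch assumes that the per-sample error in (4) is below 1/n^k and that n/(k log n) independent samples are drawn, with logarithms to base 2; these values are inferred from the query counts stated in the surrounding text, and the function names are illustrative.

```python
import math


def num_samples(n, k):
    """Number of independent samples, n / (k log n), rounded up."""
    return math.ceil(n / (k * math.log2(n)))


def log2_joint_error(n, k):
    """log2 of (n ** -k) ** m, the bound on the probability that all m samples
    map a string outside S into T_1."""
    m = num_samples(n, k)
    return -m * k * math.log2(n)
```

Because the joint error is below 2^{-n} for each of the at most 2^n strings of length n outside S, a union bound leaves positive probability that one sequence of samples works for all of them, which is the existence claim used in (5).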

In Section 4.1 we will show that under a reasonable hypothesis we can reduce
the number of queries in Theorem 7 from O(n) to n/(k log n) for any constant k > 0.
See Corollary 3.

We do not know whether the NP ∩ SPARSE equivalent of Theorem 7 holds:
Does NP ∩ SPARSE have a complete set under reductions asking O(n) queries?
See Section 6 for a discussion.

4 Reductions with Advice — Tight Results 

Our results in Section 3 pointed out a difference in the power of reductions making
o(n/log n) queries and reductions making O(n) queries. In this section we
close the remaining gap between o(n/log n) and O(n) by considering reductions
that take some advice. The approach works for both the NP ∩ SPARSE setting
and the SPARSE-to-TALLY setting.

4.1 SPARSE to TALLY 

We first observe that Theorem 6 also holds when we allow the reduction O(log n)
bits of advice.

Theorem 8. Let c be any positive constant. There exists a sparse set S that does
not reduce to any tally set T under o(n/log n)-tt-reductions that take c · log n
bits of advice.

Theorem 8 is essentially optimal under a reasonable assumption as the next
result shows.

Theorem 9. Suppose there exists a set in DTIME[2^{O(n)}] that requires circuits
of size 2^{Ω(n)} even when the circuits have access to an oracle for SAT. Then
for all relativized worlds, every sparse set S and every positive constant k, there
exists a tally set T and a ctt-reduction from S to T that asks n/(k log n) queries and
O(log n) bits of advice.

Proof. Let S be a sparse set. The construction in the proof of Theorem 7 can
be seen as a ctt-reduction of S to the tally set T_1 that makes n/(k log n) queries
and gets O(n) bits as advice, namely the sequence of ρ_i's, each of length
ℓ(n) ∈ O(log n).

We will now show how the hypothesis of Theorem 9 allows us to reduce the
required advice from O(n) to O(log n) bits.

The requirement the ρ_i's have to fulfill is condition (5). By a slight change
in the parameters of the proof of Theorem 7 (namely, by replacing k by 2k in
(4)), we can guarantee that most sequences ρ_i actually satisfy (5). Since the
implication from left to right in (5) holds for any choice of ρ_i's, we really only
have to check

∀x ∈ Σ^n : x ∉ S ⇒ (∃i) R(x,ρ_i) ∉ T_1. (6)

Without loss of generality, we can assume that Q_R(Σ^n) ∩ T_1 = Q_R(S ∩ Σ^n) ∩ T_1,
where Q_R(X) = {R(x,ρ) | x ∈ X and |ρ| = ℓ(|x|)}. Therefore, we can replace
(6) by the condition

∀x ∈ Σ^n : x ∉ S ⇒ (∃i) R(x,ρ_i) ∉ Q_R(S ∩ Σ^n). (7)

Since S is sparse, this condition on the ρ_i's can be checked by a polynomial-size
family of circuits with access to an oracle for SAT: The circuit has an enumeration
of the elements of S ∩ Σ^n built in, and once a polynomial-time enumeration of
S ∩ Σ^n is available, (7) becomes a coNP predicate.

Under the hypothesis of Theorem 9, Klivans and Van Melkebeek [KvM99,
Theorem 4.2] construct a polynomial-time computable function f that maps
strings of O(log n) bits to sequences ρ_i such that most of the inputs map to
sequences satisfying (7). An explicit input to f for which this holds suffices as
advice for our reduction from S to T = T_1. □

Since we can encode the advice in a tally set and recover it from the tally set
using O(log n) queries, we obtain the following in the terminology of Theorem 7.

Corollary 3. Under the same hypothesis as in Theorem 9, for any constant
k > 0 every sparse set S is reducible to some tally set T under a 2-round
tt-reduction asking n/(k log n) queries.
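
The paper does not spell out how advice is stored in a tally set. One possible encoding, given here purely as an assumption, places bit j of the advice for input length n at the tally string of length pair(n, j), where pair is the Cantor pairing function. The names pair, encode_advice and recover_advice are hypothetical.

```python
def pair(n, j):
    """Cantor pairing, used to index bit j of the advice for input length n."""
    return (n + j) * (n + j + 1) // 2 + j


def encode_advice(advice_by_length):
    """Tally set containing 1^pair(n, j) iff bit j of the advice for length n is 1."""
    return {"1" * pair(n, j)
            for n, bits in advice_by_length.items()
            for j, b in enumerate(bits) if b == "1"}


def recover_advice(T, n, num_bits):
    """Recover num_bits advice bits for length n with one query per bit.

    The query list depends only on n and num_bits, so the queries are nonadaptive.
    """
    queries = ["1" * pair(n, j) for j in range(num_bits)]
    return "".join("1" if q in T else "0" for q in queries)
```

With this encoding, recovering O(log n) advice bits costs O(log n) queries, each of length polynomial in n, which matches the query count used in the text.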
4.2 Relativized Worlds

Our tight results about the reducibility of SPARSE to TALLY carry over to
the NP ∩ SPARSE setting.

Theorem 10. For any constant c > 0, there exists a relativized world where
NP ∩ SPARSE has no complete sets under o(n/log n)-tt-reductions that take
c · log n bits of advice.

We also note that Theorem 4 can take up to n − ω(log n) bits of advice.

Theorem 11. There exists a relativized world where NP ∩ SPARSE has no
complete sets under dtt-reductions that take n − ω(log n) bits of advice.

On the positive side, we obtain:

Theorem 12. Suppose there exists a set in DTIME[2^{O(n)}] that requires circuits
of size 2^{Ω(n)} even when the circuits have access to an oracle for SAT. Then for
all relativized worlds and all values of k > 0, NP ∩ SPARSE has a complete
set under ctt-reductions that ask n/(k log n) queries and O(log n) bits of advice.

5 NP ∩ SPARSE and Other Promise Classes

Informally, a promise class has a restriction on the set of allowable machines
beyond the usual time and space bounds. For example, UP consists of languages
accepted by NP-machines with at most one accepting path. Other common
promise classes include NP ∩ coNP, BPP (randomized polynomial time),
BQP (quantum polynomial time) and NP ∩ SPARSE.

Nonpromise classes have simple complete sets, for example:

{⟨i, x, 1^j⟩ | M_i(x) accepts in at most j steps} (8)

is complete for NP if M_i are nondeterministic machines, but no such analogue
works for UP.

We say that UP has a uniform enumeration if there exists a computable
function φ such that for each i and input x, M_{φ(i)}(x) uses time at most |x|^i and
has at most one accepting path on every input and UP = ∪_i L(M_{φ(i)}). Uniform
enumerations for the other promise classes are similarly defined.

It turns out that for most promise classes, having a complete set and a
uniform enumeration are equivalent. Hartmanis and Hemachandra [HH88] show
this for UP and their proof easily generalizes to the other classes.

Theorem 13 (Hartmanis-Hemachandra). The classes UP, NP ∩ coNP,
BPP and BQP have complete sets under many-one reductions if and only if
they have uniform enumerations.

For NP ∩ SPARSE neither direction of the proof goes through. In fact, despite
Theorem 3, NP ∩ SPARSE has a uniform enumeration (in all relativized
worlds).
Theorem 14. The class NP ∩ SPARSE has a uniform enumeration.

In some sense Theorem 14 is a cheat. In the uniform enumeration, all the
sets are sparse but we cannot be sure of the census function at a given input
length. To examine this case we extend the definition of uniform enumeration.

Definition 2. We say NP ∩ SPARSE has a uniform enumeration with size
bounds if there exists a computable function φ such that NP ∩ SPARSE =
∪_i L(M_{φ(i)}) and for all i and n, M_{φ(i)} accepts at most n^i strings of length n
using at most n^i time.

Hemaspaandra, Jain and Vereshchagin [HJV93] defined a similar extension for
the class FewP.

We can use Definition 2 to prove a result similar to Theorem 13 for the class
NP ∩ SPARSE.

Theorem 15. NP ∩ SPARSE has complete sets under invertible reductions if
and only if NP ∩ SPARSE has a uniform enumeration with size bounds.

The promise class NP ∩ SPARSE differs from the other classes in another
interesting way. Consider the question as to whether there exists a language 
accepted by a nondeterministic machine using time which has at most one 
accepting path on each input that is not accepted by any such machine using 
time . This remains a murky open question for UP and the other usual promise 
classes. 

For NP ∩ SPARSE the situation is quite different as shown by Seiferas,
Fischer and Meyer [SFM78] and Žák [Zak83].

Theorem 16 (Seiferas-Fischer-Meyer, Žák). Let the functions t_1 and t_2 be
time-constructible such that t_1(n+1) = o(t_2(n)). There exists a tally set accepted
by a nondeterministic machine in time t_2(n) but not in time O(t_1(n)).

6 Open Problems 

Several interesting questions remain including the following. 

— Theorem 7, which shows that every sparse set reduces to a tally set using O(n)
queries, does not seem to give a corresponding result for NP ∩ SPARSE-
complete sets. Is there a relativized world where NP ∩ SPARSE does not
have complete sets under Turing reductions using O(n) queries? If we can
construct the ρ_i's in the proof of Theorem 7 in polynomial time using access
to a set in NP ∩ coNP, the answer is yes. However, the best we know is to
construct them in polynomial time with oracle access to NP^{NP}.

— Can we reduce or eliminate the assumption needed for Theorem 9, Corollary 3,
and Theorem 12? If we knew how to construct the ρ_i's from the proof
of Theorem 9 in polynomial time with O(log n) bits of advice, we could drop
the assumption.

— Does NP ∩ SPARSE having m-complete sets imply NP ∩ SPARSE has
a uniform enumeration with size bounds? Can we construct in a relativized
world a complete set for NP ∩ SPARSE that is not complete under invertible
reductions?
Abstract. We show that there is a set which is almost complete but
not complete under polynomial-time many-one (p-m) reductions for the
class E of sets computable in deterministic time 2^{lin}. Here a set A in a
complexity class C is almost complete for C under some reducibility r if
the class of the problems in C which do not r-reduce to A has measure
0 in C in the sense of Lutz’s resource-bounded measure theory. We also
show that the almost complete sets for E under polynomial-time bounded
one-one length-increasing reductions and truth-table reductions of norm
1 coincide with the almost p-m-complete sets for E. Moreover, we obtain
similar results for the class EXP of sets computable in deterministic
time 2^{poly}.

1 Introduction 

Lutz HH introduced measure concepts for the standard deterministic time and 
space complexity classes which contain the class E of sets computable in deterministic
time 2^{lin}. These measure concepts have been used for investigating
quantitative aspects of the internal structure of the corresponding complexity 
classes. Most of this work focussed on the measure for E, since the majority of 
the results obtained there carry over to the larger complexity classes. For recent 
surveys of the work on resource-bounded measure, see Lutz [15] and Ambos-Spies
and Mayordomo [2].

Lutz’s measure on E does not only allow to measure in E the relative size 
of classes of sets with interesting structural properties - like e.g. the classes of 
complete sets under various reducibilities or of the P-bi-immune sets, i.e., the 
sets which are intractable almost everywhere - but it also leads to important new 
concepts. The most investigated concept in this direction is probably that of a 
weakly complete set introduced by Lutz in nn). While all sets in E can be reduced 
to a complete set (under some given polynomial time reducibility notion), for a 
weakly complete set, Lutz only requires that the class of the reducible sets does 
not have measure 0 in E, i.e., is a non-negligible part of E. 

* Supported by Marie Curie Fellowship ERB-FMBI-CT98-3248. 
Originally, Lutz introduced weak completeness for the polynomial time many-one
(p-m) reducibility - the reducibility which is used in most completeness
proofs in the literature - and he showed that there actually is a weakly
p-m-complete set for E which is not p-m-complete for E. In fact, the
class of weakly p-m-complete sets for E has measure 1 in E (Ambos-Spies, Terwijn,
Zheng [6]) whereas the class of p-m-complete sets for E has measure 0 in
E (Mayordomo), whence weak completeness leads to a new large class of
provably intractable problems.

The large gap between completeness and weak completeness suggested the 
search for intermediate concepts. A natural candidate for such a concept in the 
context of resource-bounded measure is the following. Call a set A in E almost 
p-m-complete for E if the class of problems which are p-m-reducible to A has 
measure 1 in E, i.e., if the sets in E which are not reducible to A can be neglected 
with respect to measure. Zheng and others (see e.g. |^, Section 7) raised the 
question whether there are almost p-m-complete sets for E which are not p-m- 
complete for E. Here we answer this question affirmatively by constructing a set 
with these properties. 

Our result is contrasted by a result of Regan, Sivakumar and Cai [18], which
implies that for the standard transitive polynomial-time reducibilities allowing 
more than one oracle query - like bounded truth-table (btt), truth-table (tt), 
and Turing (T) - completeness and almost completeness coincide. It follows that 
any almost p-m-complete set for E is p-btt-complete for E, whence - in contrast 
to the weakly p-m-complete sets - the class of the almost p-m-complete sets for 
E has measure 0 in E. 

The above results leave the investigation of almost completeness for the poly- 
nomial reducibilities besides many-one which allow only one query. Here we show 
that the almost completeness notions coincide for the reducibilities ranging from 
one-to-one, length-increasing reductions to truth-table reductions of norm 1. 
This parallels previous observations for completeness (see (B| and CD]) and weak 
completeness (see j^). 

The outline of the paper is as follows. In Section 2 we describe the part 
of Lutz’s measure theory for E needed in the paper and we review the limiting 
result on almost completeness by Regan, Sivakumar, and Cai. Section 3 contains 
the proof of our main result, while in Section 4 the relations among the various 
completeness notions are discussed. In Section 5 we consider extensions of our 
results to other complexity classes, and we pose some open problems. 

Our notation is standard. For unexplained notation we refer to [3]. The polynomial
time reductions considered here are general reductions of Turing type
(p-T), truth-table reductions (p-tt) allowing only non-adaptive queries, bounded 
truth-table reductions (p-btt) in which in addition the number of queries is 
bounded by a constant, and the special case hereof where this constant c is fixed 
(btt(c)). We will represent p-btt(1)-reductions by a pair of polynomial time
computable functions g and h where g(x) gives the string queried on input x and
the Boolean function h(x) tells how the answer of the oracle is evaluated. If
the reduction is positive, i.e., h(x)(i) = i for all strings x and all i in {0,1},
we get a p-many-one-reduction (p-m) and in this case we omit h. If in addition
g is one-to-one (and length-increasing) we obtain a (length-increasing) one-one
reduction (p-1 and p-1-li). For r in {T, tt, btt, btt(1), m, 1, 1-li} and any set A,
we let the lower p-r-span of A be the class {B : B ≤^p_r A}.
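
The representation of a btt(1)-reduction as a pair (g, h) can be sketched in Python as follows. This is a toy illustration: the oracle is any membership predicate, and the names reduce_btt1, positive and negating are hypothetical.

```python
def reduce_btt1(g, h, x, oracle):
    """Evaluate a btt(1)-reduction given as a pair (g, h) on input x.

    g(x) is the single query; h(x) is a Boolean function {0,1} -> {0,1}
    applied to the oracle's answer on that query.
    """
    answer = 1 if oracle(g(x)) else 0
    return h(x)(answer)


def positive(x):
    """Evaluator with h(x)(i) = i: the reduction is then a many-one reduction."""
    return lambda i: i


def negating(x):
    """Evaluator with h(x)(i) = 1 - i: allowed for btt(1) but not for many-one."""
    return lambda i: 1 - i
```

With the positive evaluator the output is exactly the oracle's answer on g(x), which is why h can be omitted for p-m-reductions.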

2 Measure on E and Almost Completeness 

In this section we describe the fragment of Lutz’s measure theory for the class E
of sets computable in deterministic time 2^{lin} which we will need in the following.
For a more comprehensive presentation of this theory we refer the reader to the
recent surveys by Lutz [15] and by Ambos-Spies and Mayordomo [2].

The measure on E is obtained by imposing appropriate resource-bounds on
a game theoretical characterization of the classical Lebesgue measure.¹

Definition 1. A betting strategy s is a function s : {0,1}* → [0,1]. The
(normed) martingale d_s : {0,1}* → [0,∞) induced by a betting strategy s
is inductively defined by d_s(λ) = 1 and

d_s(xi) = 2 · |i − s(x)| · d_s(x)

for x ∈ {0,1}* and i ∈ {0,1}. A martingale is a martingale induced by some
strategy. A martingale d succeeds on a set A if

lim sup_{n→∞} d(A ↾ n) = ∞,

and d succeeds on a class C if d succeeds on every member A of C.
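
The recursion in Definition 1, d_s(λ) = 1 and d_s(xi) = 2 · |i − s(x)| · d_s(x), can be evaluated directly. The Python sketch below uses exact rational arithmetic; the function name martingale and the example strategies are illustrative assumptions.

```python
from fractions import Fraction


def martingale(s, w):
    """Value d_s(w) of the martingale induced by a betting strategy s.

    s maps a binary string x to a rational in [0, 1]; w is a binary string.
    Implements d_s(empty) = 1 and d_s(x i) = 2 * |i - s(x)| * d_s(x).
    """
    d = Fraction(1)
    for n, bit in enumerate(w):
        x = w[:n]                      # prefix seen before betting on this bit
        d *= 2 * abs(int(bit) - s(x))
    return d


def cautious(x):
    """Example strategy with s(x) = 1/4 on every prefix."""
    return Fraction(1, 4)


def all_in_on_one(x):
    """Example strategy with s(x) = 0: the whole capital rides on the next bit being 1."""
    return Fraction(0)
```

For every prefix x the two successor values sum to 2 · d_s(x), which is the fairness condition that makes d_s a martingale. A strategy that always stakes everything on 1 doubles its capital along 1^n and therefore succeeds on the set whose characteristic sequence is all ones.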

It can be shown that a class C has Lebesgue measure 0, μ(C) = 0, iff some
martingale succeeds on C. So, by imposing resource bounds, the martingale
concept can be used for defining resource-bounded measure concepts.

Definition 2. Let t : N → N be a recursive function. A t(n)-martingale d is a
martingale induced by a rational valued betting strategy s such that s(x) can be
computed in t(|x|) steps for all strings x.

A class C has t(n)-measure 0, μ_{t(n)}(C) = 0, if some t(n)-martingale succeeds
on C, and C has t(n)-measure 1, μ_{t(n)}(C) = 1, if the complement of C has
t(n)-measure 0.

Note that for i ∈ {0, 1} and for recursive bounds t(n), t'(n) such that t(n) ≤
t'(n) almost everywhere,

μ_{t(n)}(C) = i ⇒ μ_{t'(n)}(C) = i ⇒ μ(C) = i.

In order to obtain measures for complexity classes, resource-bounded measure 
concepts are defined not for individual bounds but for families of bounds. In 
particular, working with polynomial bounds yields a measure on E. 

¹ Our presentation follows [2]. The t(n)-measure defined there slightly differs from the
original definition by Lutz, but both definitions lead to the same notions of p-measure
and measure on E.

Definition 3. A p-martingale d is a q(n)-martingale for some polynomial q. A
class C has p-measure 0, μ_p(C) = 0, if μ_{q(n)}(C) = 0 for some polynomial q(n),
i.e., if some p-martingale succeeds on C, and μ_p(C) = 1 if the complement of C
has p-measure 0.

A class C has measure 0 in E, μ(C|E) = 0, if μ_p(C ∩ E) = 0 and C has
measure 1 in E, μ(C|E) = 1, if the complement of C has measure 0 in E.

Lutz [13] has shown that this measure concept for E is consistent. In particular,
E itself does not have measure 0 in E, namely

μ_p(E) ≠ 0 whence μ(E|E) ≠ 0. (1)

On the other hand, every slice of E has measure 0 in E, namely

μ_p(DTIME(2^{kn})) = 0 whence μ(DTIME(2^{kn})|E) = 0. (2)

Based on the above measure for E we can now introduce the completeness notions
for E which are central for our paper. Here r ∈ {1-li, 1, m, btt(1), btt, tt, T}
denotes any of the reducibilities introduced at the end of Section 1.

Definition 4. a) (Lutz) A set A is weakly p-r-hard for E if the lower p-r-span
of A does not have measure 0 in E. If, in addition, A is in E then A is
weakly p-r-complete.

b) (Zheng) A set A is almost p-r-hard for E if the lower p-r-span of A has
measure 1 in E. If, in addition, A is in E then A is almost p-r-complete.

Intuitively, a set A in E is weakly p-r-complete for E if its lower span contains 
a non-negligible part of E and it is almost p-r-complete for E if the part of E 
which is not contained in the lower span of A can be neglected. In particular, 
every p-r-complete set for E is almost p-r-complete for E and every almost p-r-
complete set for E is weakly p-r-complete for E. Moreover, since P has measure
0 in E by (2), every weakly p-r-complete set is provably intractable.

After Lutz m demonstrated the existence of weakly p-m-complete sets for 
E which are not p-m-complete for E, weak completeness was extensively studied 
and most relations among the different weak completeness and completeness 
notions have been clarified (see Section 4 below).

A severe limitation on the existence of nontrivial almost complete sets is 
imposed by the following observation on classes which have measure 1 in E. 

Theorem 5 (Regan, Sivakumar, and Cai [18]). Let C be a class such that
μ(C|E) = 1 and C is either closed under symmetric difference or closed under
union and intersection. Then C contains all of E.

Since for r in {btt,tt,T} the lower p-r-span of any set A is closed under 
symmetric difference, this shows that the concept of almost completeness is 
trivial for these reducibilities. 

Corollary 6. For r in {btt,tt,T}, every almost p-r-complete (hard) set for E 
is p-r-complete (hard) for E. 
In general, however, the lower p-m-span of a set is neither closed under symmetric
difference nor under union or intersection, whence the above argument
does not work for almost p-m-completeness. As an immediate consequence of
Corollary 6, however, almost p-m-complete sets for E must be p-btt-complete
for E. Since the class of p-btt-complete sets has p-measure 0 (see [H]), this also 
shows that almost p-m-complete sets are scarce. 

Corollary 7. Every almost p-m-complete (hard) set for E is p-btt-complete
(hard) for E. In particular, the class of the almost p-m-complete sets for E has
p-measure 0, hence measure 0 in E.

In fact, Corollary 7 can be strengthened to the following result, which was
observed first in 0. Here the proof follows easily from a result of Wang (see 
0 Lemma 6.19]), which implies that every set in E can be presented as the 
symmetric difference of two n^k-random sets (for any k ≥ 1).

Corollary 8. Every almost p-m-hard set for E is p-btt(2)-hard for E.

Despite these limitations, in the next section we will show that there are 
almost p-m-complete sets for E which are not p-m-complete for E. Moreover in 
Section 4 we will obtain the same results for some other p-reducibilities allowing
only one oracle query by showing that all these reducibilities yield the same class 
of almost complete sets. 

Our results and proofs will use the characterization of p-measure and mea- 
sure in E in terms of resource-bounded random sets. In the remainder of this 
section we shortly describe this approach (from 0 ) and state some results on 
the measure in E in terms of random sets, which we will need in the following. 

Definition 9. A set R is t(n) -random if no t(n) -martingale succeeds on R. 

For later use, we observe the following trivial relations among random sets 
for increasing time bounds. 

Proposition 10. Let t(n), t'(n) be recursive functions such that t(n) ≤ t'(n)
almost everywhere. Then every t'(n)-random set is t(n)-random.

The characterization of the p-measure and the measure in E in terms of 
random sets is as follows. 

Lemma 11 (Ambos-Spies, Terwijn, and Zheng [6]). For any class C,

(i) μ_p(C) = 0 iff there is a number k such that C does not contain any
n^k-random set, and

(ii) μ(C|E) = 0 iff there is a number k such that C ∩ E does not contain any
n^k-random set.

To illustrate how results on the p-measure and measure in E can be rephrased 
in terms of randomness and for later use, we consider Mayordomo’s result that 
the class of DTIME(2^{kn})-bi-immune sets has p-measure 1 and the extension of
this result to the class of the DTIME(2^{kn})-incompressible sets due to Juedes
and Lutz (see m or 0 for details). 
Theorem 12. (a) (Mayordomo) Every -random set is DTIME(2^{kn})-bi-immune.

(b) (Juedes and Lutz) Every -random set is DTIME(2^{kn})-incompressible.

Finally, for the proof of our main theorem in the next section, we will need 
the following instance of the Borel-Cantelli-Lemma for p-measure (see Regan 
and Sivakumar uni for a more general discussion of this lemma). We omit the 
easy proof which works by constructing a martingale which, for all k, reserves a
fraction of 1/2^k of the initial capital for betting on the event that the intersection
with D_k is empty.

Lemma 13. Let {D_1, D_2, . . . } be a sequence of pairwise disjoint finite sets where
Dk has cardinality k. Assume further that given x, in time 0(2^l“l), firstly, one 
can decide whether x is in D_k for some k and, if so, secondly, one can compute
k and a list of all strings y < x in D_k. Then every -random set intersects
almost all of the sets D_k.



3 An Almost Complete Set Which Is Not Complete 

We now turn to the main result of this paper. 

Theorem 14. There is an almost p-m-complete set for E which is not p-m-
complete for E.

For a proof of Theorem 14 it suffices to show the following lemma.

Lemma 15. There are sets A and B in E such that B ≰^p_m A and

R ≤^p_m A for all -random sets R in EXP. (3)

Then, for such sets A and B, the set A is almost p-m-complete for E by (3) and
Lemma 11, whereas B is not p-m-reducible to A and thus A is not p-m-complete
for E. In fact, for this argument it suffices to consider E in place of EXP in
(3). We will use in Section 5, however, that the extension proved here will lead
simultaneously to a corresponding result for the class EXP, i.e., the class of sets
computable in time 2^{poly}.

Proof. We construct sets A and B as required in stages. To be more precise, we 
choose a strictly increasing function $h : \mathbb{N} \to \mathbb{N}$ with $h(0) = 0$ and we determine 
the values of A and B for all strings in the interval $I_k = \{x : h(k) \le |x| < h(k+1)\}$ 
at stage k. Here the function h is chosen to be p-constructible and, for 
technical reasons to be explained below, to satisfy 

(i) $k^2 < h(k)$ \quad (ii) $p_k(h(k)) < 2^{h(k)}$ \quad (iii) $p_k(h(k)) < h(k+1)$ \qquad (4) 

for all $k > 0$, where $p_k(n) = n^k + k$. Note that given x, by p-constructibility of h, 
we can compute the index k such that $x \in I_k$, as well as $h(k)$, in poly(|x|) steps. 

Before we define stage k of the construction formally, we first discuss the 
strategies to ensure the required properties of A and B and simultaneously 
introduce some notation required in the construction. 
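
The interval bookkeeping can be made concrete with a short sketch. It assumes that $I_k$ consists of the strings x with $h(k) \le |x| < h(k+1)$ and uses a hypothetical stage function `h`, chosen only because it is strictly increasing with `h(0) = 0`; it is not claimed to satisfy the conditions in (4) or to be the function used in the proof.

```python
def h(k: int) -> int:
    """Hypothetical strictly increasing stage function with h(0) = 0."""
    value = 0
    for _ in range(k):
        value = value * value + 2
    return value


def interval_index(length: int) -> int:
    """Return the k with h(k) <= length < h(k + 1)."""
    k = 0
    while h(k + 1) <= length:
        k += 1
    return k


def designated_strings(k: int) -> list[str]:
    """First k^2 strings of I_k in length-lexicographic order.

    They all have length h(k) provided there are at least k^2 strings of
    that length, which is checked explicitly here.
    """
    n = h(k)
    if k * k > 2 ** n:
        raise ValueError("I_k has fewer than k^2 strings of length h(k)")
    return [format(i, "b").zfill(n) for i in range(k * k)]
```

For this toy `h` the interval boundaries are 0, 2, 6, 38, so a string of length 5 belongs to $I_1$ and a string of length 6 to $I_2$.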

In order to ensure (3) we let A sufficiently resemble a p-m-complete set for 
EXP. Let $\{C_e : e \ge 0\}$ be an effective enumeration of EXP such that $C_e(x)$ 
can be computed uniformly in -|- e steps, and let 

E = : a; G Ce & e S N} 

be the padded disjoint union of these sets. Then E can be computed in time $2^n$ 
and, for all e, $C_e$ is p-m-reducible to E via 

ge{x) = 

whence E is p-m-complete for EXP. So, if we let 

$\mathrm{CODE}_e = \mathrm{range}(g_e)$ 

denote the set of strings used for coding $C_e$ into E, then in order to satisfy (3) 
it suffices to meet, for all numbers $e \ge 0$, the requirement 

$R^1_e$: If $C_e$ is n^-random then $A \cap \mathrm{CODE}_e$ is a finite variant of $E \cap \mathrm{CODE}_e$. 

In order to meet the requirements $R^1_e$ we will let A look like E unless the 
task of making B not p-m-reducible to A forces a disagreement. Since E is in 
DTIME($2^n$) this procedure is compatible with ensuring that A is in E as long 
as the strings on which A and E differ can be recognized in exponential time. In 
this context note that the sets $\mathrm{CODE}_e$ are pairwise disjoint and that, for given 
x, poly(|x|) steps suffice to decide whether x is a member of one of these sets 
and, if so, to compute the unique e with $x \in \mathrm{CODE}_e$. 

The condition $B \not\le^p_m A$ is satisfied by diagonalization. Let $\{f_k : k \ge 1\}$ be an 
effective enumeration of the p-m-reductions such that $f_k(x)$ can be computed 
uniformly in $p_k(|x|) = |x|^k + k$ steps. Then it suffices to meet the requirements 

$R^2_k$: $\exists x \in \{0,1\}^* \; (B(x) \ne A(f_k(x)))$ 

for all numbers $k \ge 1$. We will meet requirement $R^2_k$ at stage k of the 
construction. 

For this purpose we will ensure that there is a string x from a set of $k^2$ 
designated strings of length $h(k)$ such that $B(x)$ and $A(f_k(x))$ differ, while we 
will let B be empty and let A equal E on $I_k$ otherwise. We will say that this 
action injures an almost completeness requirement $R^1_e$ if, for the chosen string 
x, $f_k(x)$ is in $I_k \cap \mathrm{CODE}_e$ and A and E differ on $f_k(x)$. Since A and E agree on 
$I_k \cap \mathrm{CODE}_e$ otherwise, the conclusion of $R^1_e$ will fail if and only if the requirement 
is injured at infinitely many stages. 

To avoid injuries we will attempt to diagonalize in such a manner that injuries 
to the first k requirements $R^1_e$, $e < k$, are avoided. If the function $f_k$ is not 
one-to-one on the designated strings or if $f_k(x)$ is shorter than x for some designated 
string x then the diagonalization will not affect A on $I_k$ at all, whence no injuries 
occur. The critical case occurs if, for every designated string x, $f_k(x)$ is longer 
than x and an element of one of the sets $\mathrm{CODE}_e$ with $e < k$. By the former, 
the diagonalization has to make $A(f_k(x))$ differ from the canonical value 0 for 
$B(x)$ (not vice versa, since otherwise we might fail to make B computable in 
exponential time), whence by the latter some injuries may occur. 

By Lemma 13, however, we will be able to argue that if $C_e$ is n^-random 
and if there are infinitely many stages at which we are forced to make $A(f_k(x))$ 
differ from $B(x) = 0$ for some $f_k(x)$ in $\mathrm{CODE}_e$, then at almost all of these stages 
letting A look like E on the $f_k$-images of the designated strings will yield the 
desired diagonalization. So for n^-random $C_e$ the requirement $R^1_e$ will be injured 
only finitely often. 

We now give the formal construction. We let $B \cap I_0 = \emptyset$ and $A \cap I_0 = E \cap I_0$. 
Given $k > 0$, stage k of the construction is as follows. We assume that A and 
B have already been defined on the intervals $I_0, \ldots, I_{k-1}$ and we will specify 
both sets on the interval $I_k$. For the scope of the description of stage k we call 
the first $k^2$ strings in $I_k$ the designated strings. The designated strings are the 
potential diagonalization witnesses for requirement $R^2_k$, i.e., we will guarantee 
$B(x) \ne A(f_k(x))$ for some designated string x. Observe that every designated 
string has length $h(k)$ and is mapped by $f_k$ into the union of the intervals $I_0$ 
through $I_k$, as follows by items (i) and (iii) in (4), respectively. 

For the definition of A and B on $I_k$ we distinguish the following four cases 
with respect to the images of the designated strings under the mapping $f_k$. Here 
it is to be understood that on $I_k$ the sets A and B will always look like the set E 
and the empty set, respectively, unless this specification is explicitly overwritten 
according to one of the cases below. Moreover, as the cases are not exclusive, 
the first applicable case is always used. 

Case 1: Some designated string is not mapped to $I_k$. 

Let x be the least such string. By the preceding discussion, $f_k(x)$ is contained 
in some interval $I_j$ with $j < k$ and $A(f_k(x))$ has been defined at some 
previous stage. We let $B(x) = 1 - A(f_k(x))$ (thereby satisfying $R^2_k$). 

Case 2: Two designated strings are mapped to the same string. 

Let x be the least designated string such that $f_k(x) = f_k(x')$ for some 
designated string $x' \ne x$ and let $B(x) = 1$. (Then $B(x) = 1$ differs from 
$B(x') = 0$, whereas $f_k$ maps x and x' to the same string, whence $R^2_k$ is met.) 

Case 3: Some designated string is not mapped to the set $\bigcup_{e<k} \mathrm{CODE}_e$. 

Let x be the least such designated string and let $A(f_k(x)) = 1$. (Note that, by 
failure of Case 1, $f_k(x)$ is in $I_k$, and $R^2_k$ is met since $B(x) = 0$ by convention.) 

Case 4: Otherwise. 

In this case the $k^2$ designated strings are mapped by $f_k$ to $k^2$ different strings 
in $\bigcup_{e<k} \mathrm{CODE}_e$, whence we can let $e_k$ be the least $e < k$ such that $f_k$ maps 
at least k designated strings to $\mathrm{CODE}_{e_k}$. Let $J_k$ be the set of the least k 
designated strings which are mapped to $\mathrm{CODE}_{e_k}$ and let $F_k = \{f_k(x) : x \in J_k\}$ 
be the $f_k$-image of $J_k$. Observe that by case assumption all strings in 
$F_k$ are in $I_k$. In case E does not intersect $F_k$, we let $A(y) = 1$ where y is 
the maximal element in $F_k$ and we say that $R^1_{e_k}$ is injured at stage k. (Then 
$R^2_k$ is met, because either there has already been a string x in $J_k$ such that 
$B(x) = 0$ differs from $A(f_k(x))$ or we enforce such a disagreement for some 
x where $f_k(x) = y$.) 
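
The case distinction of stage k can be sketched as a small decision procedure. This is only an illustration under stated assumptions: the reduction `f`, the interval test `in_Ik` and the decoder `code_of` (returning the index e of the coding set containing a string, or `None`) are hypothetical stand-ins supplied by the caller, and `designated` is assumed to be listed in increasing order so that "the least such string" is the first match.

```python
from typing import Callable, Optional


def stage_case(
    k: int,
    designated: list[str],
    f: Callable[[str], str],
    in_Ik: Callable[[str], bool],
    code_of: Callable[[str], Optional[int]],
) -> tuple[int, Optional[str]]:
    """Return (case number, least witness) for stage k; the first applicable case wins."""
    images = [f(x) for x in designated]
    for x, y in zip(designated, images):  # Case 1: an image falls outside I_k
        if not in_Ik(y):
            return 1, x
    for x, y in zip(designated, images):  # Case 2: f is not one-to-one
        if images.count(y) > 1:
            return 2, x
    for x, y in zip(designated, images):  # Case 3: an image is in no CODE_e with e < k
        e = code_of(y)
        if e is None or e >= k:
            return 3, x
    return 4, None  # Case 4: otherwise


def case4_block(
    k: int,
    designated: list[str],
    f: Callable[[str], str],
    code_of: Callable[[str], Optional[int]],
) -> tuple[int, list[str], set[str]]:
    """In Case 4, return (e_k, J_k, F_k) as described in the construction."""
    for e in range(k):
        # the least k designated strings whose images lie in CODE_e
        J = [x for x in designated if code_of(f(x)) == e][:k]
        if len(J) == k:
            return e, J, {f(x) for x in J}
    raise ValueError("Case 4 does not apply")
```

In Case 4 the pigeonhole principle guarantees that the loop in `case4_block` succeeds: $k^2$ distinct images are spread over at most k coding sets, so some set receives at least k of them.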

This completes the construction. It remains to show that the constructed sets 
have the required properties. We first observe that the constructed sets are in 
DTIME($2^{2n}$). We sketch the proof for the set A and leave the similar proof for 
the set B to the reader. Given a string y, we can compute in time poly(|y|) the 
index k where y is in $I_k$, as well as $h(k)$. Further it takes time $O(k^2 p_k(h(k)))$ to 
compute the list of all pairs $(x, f_k(x))$ such that x is a designated string of stage 
k and it takes time polynomial in the length of this list to check which of the 
four cases applies and to determine whether, according to this case, A(y) might 
differ from E(y) at all. If not, we simply have to compute E(y). Otherwise, we 
know that either Case 3 applies and y is in A or Case 4 applies, y is the maximal 
string in $F_k$, and y is in A iff none of the $k - 1$ smaller strings in $F_k$ is also in 
E. Using item (ii) in (4) it is then a routine task to show that A in fact can be 
computed in time $2^{2n}$. 

It remains to show that the requirements $R^1_e$, $e \ge 0$, and $R^2_k$, $k > 0$, are met. 
For the latter this is immediate by the comments made in the individual cases 
of the construction. For a proof that the almost completeness requirements $R^1_e$ 
are met, fix $e \ge 0$ and for a contradiction assume that $R^1_e$ fails. Then $C_e$ is 
n^-random and $A \cap \mathrm{CODE}_e$ and $E \cap \mathrm{CODE}_e$ differ on infinitely many intervals $I_k$. 
By construction, the latter implies that there are infinitely many stages k where 
$R^1_e$ is injured. Now, for stages k where Case 4 applies, let $D_k = \{g_e^{-1}(y) : y \in F_k\}$ 
and let $D_k$ be the inverse image of the first k strings in $\mathrm{CODE}_e \cap I_k$, otherwise. 
Then, using the second item in (4), one can easily check that the sequence 
$\{D_k : k \ge 1\}$ satisfies the hypothesis of Lemma 13. So we may fix k such that 
$R^1_e$ is injured at stage k (whence in particular Case 4 applies during stage k) 
and $C_e$ intersects $D_k$, say $z \in C_e \cap D_k$. Then, by the latter, $g_e(z) \in E \cap F_k$, 
whence $R^1_e$ is not injured at k, contrary to the choice of k. □ 

4 Comparing Completeness Notions 

The polynomial-time reducibilities allowing only one oracle query, in the range 
from one-to-one, length-increasing reductions to truth-table reductions of norm 
1, lead to the same class of complete sets for E. Namely, Berman Q has shown 
that every p-m-complete set for E is in fact p-1-li-complete while Homer et al. cni 
have proven that every p-btt(1)-complete set for E is in fact p-m-complete for 
E. Corresponding results for weak completeness have been shown by 
Ambos-Spies et al. @]. By the two following theorems, the same phenomenon occurs for 
almost completeness. Due to space considerations we state these results without 
giving a proof. 

Theorem 16. A set is almost p-m-complete for E if and only if it is almost 
p-1-li-complete for E. 

Theorem 17. A set is almost p-btt(1)-complete for E if and only if it is almost 
p-m-complete for E. 

Previous results in the literature together with the results of this paper clarify 
most of the relations among the different completeness notions for E. If we 
let C(E,r) denote the class of p-r-complete sets for E, and if AC(E,r) and 
WC(E,r) denote the corresponding classes of almost and weakly complete sets, 
respectively, the known relations among the classes are summarized in Figure 1. 

Fig. 1. The figure shows the known relations among the completeness notions 
discussed in this paper. Here '⊂' means that a class is a proper subclass; '⊆' 
indicates that it is not known if the inclusion is strict. All classes contained in 
AC(E,btt) have measure 0 in E, whereas all the weakly complete classes (i.e. 
the third column) are known to have measure one in E. The measure in E of 
the remaining four classes (the complete and almost complete sets for p-tt- and 
p-T-reducibility) is hitherto unknown. 



Note that in Figure 1 the inclusions from top to bottom and from left to right 
are immediate by definition. The two equalities in the first column are due to 
Berman 0 and Homer et al. (see above), while the strictness of the remaining 
three inclusions in this column has been established by Watanabe m who 
separated the standard completeness notions for reducibilities which allow more 
than one query. The two equalities in the second column are justified by 
Theorems 16 and 17 above. It follows with Theorem 14 that the first three inclusions 
from column 1 to column 2 are proper, while the coincidence of completeness and 
almost completeness for the other three reducibilities follows from Corollary El 
above due to Regan et al. m. This corollary also yields that the last two 
inclusions in column 2 are proper. That the class AC(E,btt(1)) is a proper subclass 
of the class AC(E,btt) follows from CorollaryO, since Watanabe has shown 
that there is a p-btt-complete set for E which is not p-btt(2)-complete. 

The relations stated in the third column have been established by 
Ambos-Spies et al. in ^ where weak completeness notions are compared. The strictness 
of the first four inclusions between the second and the third column follows 
from the observation that AC(E,btt) has measure 0 in E (Corollary |3) whereas 
WC(E,m) has measure 1 in E (0). 

Finally, the question whether the last two inclusions between the second and 
the third column are proper is still open. It has been shown, however, that these 
questions cannot be resolved by relativizable techniques: namely, Allender and 
Strauss P have shown that, relative to some oracle, all n^-random sets are p-tt- 
complete whereas Ambos-Spies, Lempp, and Mainhardt |2| and, independently, 
Buhrman et al. have given oracles relative to which no n^-random set is p-T- 
complete for E. This also shows that the measure in E of the classes of complete 
and almost complete sets for p-tt- and p-T-reducibility is oracle dependent. 

5 Further Results and Open Problems 

In this paper we looked at the concept of almost completeness only for the class 
E of sets computable in linear exponential time. Similar results, however, can be 
obtained for other complexity classes. In particular, all of our results can also be 
shown for Lutz’s measure on the class EXP of sets computable in time $2^{\mathrm{poly}}$. 
The analog of our main theorem (Theorem 14) in this setting follows directly 
from Lemma 15, while analogs of the other results require only minor changes in 
the proofs. The relations among the different completeness notions in Figure 1 
will remain the same if we replace E by EXP. 

The relation between almost p-m-completeness for E and EXP is known 
only in part. While it is well known that, for sets in E, p-m-completeness for 
E and EXP coincide, Juedes and Lutz m have shown that every weakly p-m- 
complete set for E is also weakly p-m-complete for EXP but that there are 
weakly p-m-complete sets for EXP in E which are not weakly p-m-complete for 
E. By refining the technique used in the proof of our main theorem, Ambos-Spies 
has shown that there is an almost p-m-complete set for EXP in E which is not 
almost - in fact not even weakly - p-m-complete for E. We do not know, how- 
ever, whether every almost p-m-complete set for E is also almost p-m-complete 
for EXP. 

Abstract. We show the following new lowness results for the probabilistic 
class ZPP^NP. 

— The class AM ∩ coAM is low for ZPP^NP. As a consequence it follows 
that Graph Isomorphism and several group-theoretic problems 
known to be in AM ∩ coAM are low for ZPP^NP. 

— The class IP[P/poly], consisting of sets that have interactive proof 
systems with honest provers in P/poly, is also low for ZPP^NP. 

We consider lowness properties of nonuniform function classes, namely, 
NPMV/poly, NPSV/poly, NPMVt/poly, and NPSVt/poly. Specifically, 
we show that 

— Sets whose characteristic functions are in NPSV/poly and that have 
program checkers (in the sense of Blum and Kannan) are low for 
AM and ZPP^NP. 

— Sets whose characteristic functions are in NPMVt/poly are low for 
$\Sigma_2^p$. 


1 Introduction 

In the recent past the probabilistic class ZPP^NP has appeared in different 
results and contexts in complexity theory research. E.g. consider the result MA ⊆ 
ZPP^NP p 121 which sharpens and improves Sipser’s theorem BPP ⊆ $\Sigma_2^p$. The 
proof in p] uses derandomization techniques based on hardness assumptions m. 
Another example is the result that if SAT ∈ P/poly then PH = ZPP^NP pT!lll()|, 
which improves the classic Karp-Lipton theorem.^1 Actually, Köbler and 
Watanabe in ^D] prove that every self-reducible set^2 A in (NP ∩ co-NP)/poly is low for 
ZPP^NP, i.e. ZPP^{NP^A} = ZPP^NP. This stronger result is in a sense natural, since 
there is usually an underlying lowness result that implies a collapse consequence 
result like the Karp-Lipton theorem. We may recall here that the lowness result 
underlying the Karp-Lipton theorem is that self-reducible sets in P/poly are low 
for $\Sigma_2^p$. 

The notion of lowness was first introduced in complexity theory by Schöning 
in m. It has since then been an important conceptual tool in complexity theory, 
see e.g. the survey paper PS|. 

^1 The Karp-Lipton theorem states that if SAT ∈ P/poly then PH collapses to $\Sigma_2^p$. 
^2 By self-reducibility we mean word-decreasing self-reducibility, which is adequate 
because standard complexity classes contained in EXP have such self-reducible 
complete problems. 

1.1 Lowness for ZPP^NP 

We recall the formal definition of lowness m. For a relativizable complexity 
class C such that for all sets A, A ∈ C^A, let Low(C) denote {A | C^A = C}. 

Clearly, Low(C) is contained in C and consists of languages that are powerless 
as oracle for C. 

Few complexity classes have their low sets exactly characterized. These are 
well-known examples: Low(NP) = NP ∩ co-NP, Low(AM) = AM ∩ coAM |2S|. 
For most complexity classes, however, a complete characterization of the low 
sets appears to be a challenging open question. Regarding Low($\Sigma_2^p$), Schöning 
proved m that AM ∩ coAM is contained in Low($\Sigma_2^p$), implying that Low(AM) ⊆ 
Low($\Sigma_2^p$). This containment is anomalous because AM ⊄ $\Sigma_2^p$ in some relativized 
worlds m. Indeed, lowness appears to have other anomalous properties: it is 
not known to preserve containment of complexity classes, for example NP ⊆ PP 
but NP ∩ co-NP is not known to be in Low(PP). Similarly, NP ⊆ MA but 
NP ∩ co-NP is not known to be in Low(MA). Little is known about Low(MA) 
except that it contains BPP and is contained in MA ∩ co-MA m. 

Regarding ZPP^NP, it is shown in ^ 0 ] that Low(ZPP^NP) ⊆ Low($\Sigma_2^p$). No 
characterization of Low(ZPP^NP) is known. Our aim is to show some inclusions 
in Low(ZPP^NP) as a first step. 

We first show in this paper that AM ∩ coAM is low for ZPP^NP, i.e. AM ∩ 
coAM ⊆ Low(ZPP^NP). Hence we have the inclusion chain 

Low(MA) ⊆ Low(AM) ⊆ Low(ZPP^NP) ⊆ Low($\Sigma_2^p$). 

It follows that Graph Isomorphism and other group-theoretic problems known 
to be in AM ∩ coAM Pj are low for ZPP^NP. 

We prove another lowness result for ZPP^NP: Let IP[P/poly] denote the 
class of languages that have interactive proof systems with honest prover in 
P/poly. We show that IP[P/poly] ⊆ Low(ZPP^NP), improving the containment 
IP[P/poly] ⊆ Low($\Sigma_2^p$) shown in Pj. Our proof has a derandomization 
component in which the Nisan-Wigderson pseudorandom generator m is used to 
derandomize the verifier in the IP[P/poly] protocol. The rest of the proof is 
based on the random sampling technique as applied in imczj. 



1.2 NP/poly ∩ co-NP/poly and Subclasses 

As shown in m, self-reducible sets in (NP ∩ co-NP)/poly are low for ZPP^NP. 
However, there are technical difficulties due to which this result doesn’t seem 
to carry over to NP/poly ∩ co-NP/poly. The best known collapse consequence 
of NP ⊆ NP/poly ∩ co-NP/poly (equivalently, NP ⊆ co-NP/poly) is PH ⊆ 
$\mathrm{ZPP}^{\Sigma_2^p}$ Bni. 

In order to better understand this aspect of NP/poly ∩ co-NP/poly the 
authors of HH introduce two interesting subclasses of NP/poly ∩ co-NP/poly 
which we discuss in Section 5. We notice firstly that NP/poly ∩ co-NP/poly and 
the above-mentioned subclasses are closely connected to the function classes 
NPMV/poly, NPSV/poly, NPMVt/poly, and NPSVt/poly, which are nonuniform 
analogues of the function classes NPMV, NPSV, NPMVt, and NPSVt 
introduced and studied by Selman and other researchers [2S d. More 
precisely, we note that A ∈ (NP ∩ co-NP)/poly if and only if $\chi_A$ ∈ NPSVt/poly, 
where $\chi_A$ denotes the characteristic function of a language A. Similarly, A ∈ 
NP/poly ∩ co-NP/poly if and only if $\chi_A$ ∈ NPMV/poly. Likewise, NPSV/poly 
and NPMVt/poly capture the two new subclasses of NP/poly ∩ co-NP/poly 
defined in El. 

We prove the following new lowness results for these classes: 

— We show that self-reducible sets whose characteristic functions are in the 
function class NPMVt/poly are low for $\Sigma_2^p$ (this result is essentially the 
lowness result underlying the collapse consequence derived in m Theorem 5.2]). 

— We show that all self-checkable sets (in the program checking sense of 
Blum and Kannan) whose characteristic functions are in NPSV/poly 
are low for AM. 

Several proofs are omitted from this extended abstract. A full version of the 
paper is available as a technical report [2]. 

2 Preliminaries 

Let Σ = {0,1}. We denote the cardinality of a set X by ||X|| and the length 
of a string x ∈ Σ* by |x|. The characteristic function of a language L ⊆ Σ* is 
denoted by $\chi_L$. The definitions of standard complexity classes like P, NP, E, 
EXP etc. can be found in standard books [3,|22j. A relativized complexity class 
C with oracle A is denoted by either $C^A$ or C(A). Likewise, we denote an oracle 
Turing machine M with oracle A by $M^A$ or M(A). 

For a class C of sets and a class F of functions from 1* to Σ*, let C/F ^21 
be the class of sets A such that there is a set B ∈ C and a function h ∈ F such 
that for all x ∈ Σ*, 

$x \in A \Leftrightarrow \langle x, h(1^{|x|}) \rangle \in B$. 

The function h is called an advice function for A. 
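
The advice mechanism can be illustrated with a toy sketch. Everything in it is a hypothetical example rather than part of the paper: the advice depends only on the input length, and the uniform part `in_B` decides membership of the pair consisting of the input and the advice for its length.

```python
def advice(n: int) -> str:
    """Hypothetical advice function h(1^n): one bit per input length."""
    good_lengths = {1, 4, 6}  # arbitrary illustrative choice
    return "1" if n in good_lengths else "0"


def in_B(x: str, a: str) -> bool:
    """Uniform part: the set B of pairs <x, a>, decided here by reading a."""
    return a == "1"


def in_A(x: str) -> bool:
    """x is in A iff <x, h(1^|x|)> is in B."""
    return in_B(x, advice(len(x)))
```

The set decided by `in_A` contains exactly the strings whose length is in `good_lengths`; replacing that set by an undecidable set of lengths shows why advice classes can contain undecidable languages even when the uniform part is trivial.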

We recall definitions of AM and MA. A language L is in AM if there exist a 
polynomial p and a set B ∈ P such that for all x, |x| = n, 

$x \in L \Rightarrow \mathrm{Prob}_{r \in_R \{0,1\}^{p(n)}}[\exists y, |y| = p(n) : \langle x, y, r \rangle \in B] = 1$, 

$x \notin L \Rightarrow \mathrm{Prob}_{r \in_R \{0,1\}^{p(n)}}[\exists y, |y| = p(n) : \langle x, y, r \rangle \in B] < 1/4$. 

A language L is in MA if there exist a polynomial p and a set B ∈ P such 
that for all x, |x| = n, 

$x \in L \Rightarrow \exists y, |y| = p(n) : \mathrm{Prob}_{r \in_R \{0,1\}^{p(n)}}[\langle x, y, r \rangle \in B] > 3/4$, 

$x \notin L \Rightarrow \forall y, |y| = p(n) : \mathrm{Prob}_{r \in_R \{0,1\}^{p(n)}}[\langle x, y, r \rangle \in B] < 1/4$. 

Notice that we have taken the definition of AM with 1-sided error, known to 
be equivalent to AM with 2-sided error. Definitions for single and multiprover 
interactive proof systems can be found in standard texts, e.g. |21-. Let MIP 
denote the class of languages with multiprover interactive protocols and IP denote 
the class of languages with single-prover interactive protocols. We denote by 
MIP[C] and IP[C] the respective language classes where the prover complexity is 
bounded by FP(C), which is the class of functions that can be computed by a 
polynomial-time oracle transducer with oracle in C. 



3 AM ∩ coAM Is Low for ZPP^NP 

In this section we show that AM ∩ coAM is low for ZPP^NP. It follows that Graph 
Isomorphism and a host of group-theoretic problems known to be in AM ∩ coAM 
P] are all low for ZPP^NP. We recall here that it is already known that AM ∩ coAM 
is low for $\Sigma_2^p$ |2S1 and also for AM |l iSj . 

We notice first that although AM ∩ coAM ⊆ ZPP^NP (because AM ⊆ coR^NP 
and the equality ZPP = R ∩ coR relativizes) and AM ∩ coAM is low for itself, it 
doesn’t follow that AM ∩ coAM is low for ZPP^NP. As mentioned before, NP ∩ 
co-NP is low for NP but is not known to be low for PP or MA. 

Theorem 1. AM ∩ coAM is low for ZPP^NP. 

Proof. Let L be any set in AM ∩ coAM. We need to show that a given ZPP^{NP^L} 
machine M can be simulated in ZPP^NP. Consider an input x of length bounded 
by n to the machine M. Suppose the lengths of all the queries made to L during 
the computation are bounded by m. Since L ∈ AM ∩ coAM, it follows from 
standard probability amplification techniques (cf. that there are NP sets 
A and B, a polynomial p, and subsets $S_m \subseteq \{0,1\}^{p(m)}$ of size $\|S_m\| \ge 2^{p(m)-1}$ 
such that for all m and all strings y of length $|y| \le m$, 
y ∈ L implies 

$\forall w : \langle y, w \rangle \in A$ and $\forall w \in S_m : \langle y, w \rangle \notin B$ 

and y ∉ L implies 

$\forall w : \langle y, w \rangle \in B$ and $\forall w \in S_m : \langle y, w \rangle \notin A$. 

In other words, any string $w \in S_m$ can be used as advice to decide membership 
in L for strings of length $|y| \le m$ with an NP ∩ co-NP computation. Notice, 
however, that it would be incorrect for us to claim from here that L ∈ (NP ∩ 
co-NP)/poly, because if we use a string from $\{0,1\}^{p(m)} - S_m$ as advice, the 
resulting combination of machines for A and B may not yield an NP ∩ co-NP 
computation for some input y of length $|y| \le m$. In fact, a string $w \in \{0,1\}^{p(m)}$ is 
a good advice provided that it does not satisfy the NP predicate 

$\exists y, |y| \le m : \langle y, w \rangle \in A \cap B$. 

We now describe the ZPP^NP machine N that simulates the given ZPP^{NP^L} 
machine M on some input x. Machine N first randomly guesses a string 
$w \in \{0,1\}^{p(m)}$. By assumption, w is a good advice with probability at least 1/2, and this 
can be checked with a single query to the NP predicate above. In case w is good, 
N can use w to replace the oracle L with an NP ∩ co-NP computation when it 
simulates M on input x. □ 
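
The control flow of the simulating machine can be sketched as follows. This is a toy model under stated assumptions, not the construction itself: `is_bad_advice` stands for the single NP-oracle query that tests the predicate above, `run_M_with_advice` stands for the simulation of M with the oracle replaced by the advice-based computation, and `None` plays the role of the "?" output of a zero-error machine. All names are hypothetical.

```python
import random
from typing import Callable, Optional


def simulate_once(
    run_M_with_advice: Callable[[str], bool],
    is_bad_advice: Callable[[str], bool],
    advice_len: int,
    rng: random.Random,
) -> Optional[bool]:
    """One run of the zero-error simulation: guess advice, verify it, then use it."""
    # Guess a candidate advice string uniformly at random.
    w = "".join(rng.choice("01") for _ in range(advice_len))
    # A single oracle query decides whether w is a good advice string.
    if is_bad_advice(w):
        return None  # answer "?" instead of risking a wrong answer
    # With good advice the simulation of M is always correct.
    return run_M_with_advice(w)
```

Since at least half of the candidate strings are good, each run returns a definite answer with probability at least 1/2 and never a wrong one, which is the zero-error behaviour required of a ZPP machine.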



Corollary 1. Graph Isomorphism is low for ZPP^NP. 

The above corollary follows since Graph Isomorphism is in AM ∩ coAM. 
The lowness result also holds for various group-theoretic problems known to be 
in AM ∩ coAM ^ . 

Notice that the previous proof essentially shows that we can simulate AM ∩ 
coAM with an NP ∩ co-NP computation using a random string in a coNP set as 
advice for the computation. This observation combined with the result of m 
(that self-reducible sets in (NP ∩ co-NP)/poly are low for ZPP^NP) immediately 
yields the following corollary. 

Corollary 2. Self-reducible sets in (AM ∩ coAM)/poly are low for ZPP^NP. 

Additionally, we also have the following corollary in the average-case 
complexity setting. We first recall the definition of AV (see, e.g. fSl for a 
detailed treatment): AV is the class of decision problems A such that for every 
polynomial-time computable distribution there is an algorithm that decides A 
and is polynomial-time on the average for that distribution. 

Corollary 3. If NP ⊆ AV then AM ∩ coAM = NP ∩ co-NP. 

The proof uses the assumption NP ⊆ AV combined with the fact that for 
any set in AM ∩ coAM a large fraction of strings satisfying a coNP predicate 
are good advice strings, as we have already seen in the proof of Theorem 1. 
Thus, we can guess such an advice string and use an AV algorithm for the 
uniform distribution to verify the coNP predicate. Since the AV algorithm, with 
its running time truncated to a suitable polynomial bound, will still accept many 
good advice strings, we get an NP ∩ co-NP simulation of AM ∩ coAM. This is 
an application of ideas from m. 

4 IP[P/poly] Is Low for ZPP^NP 

The class IP[P/poly] already figures, though implicitly, in the proof of the result 
in that if EXP ⊆ P/poly then EXP = MA. We quickly recall the proof: 
Suppose EXP ⊆ P/poly. Note that each language in EXP has a multiprover 
interactive protocol in which the provers are in EXP. By assumption, therefore, 
the honest provers can be simulated by polynomial size circuits. Thus the (MIP) 
protocol can be simulated by an MA protocol where Merlin simply sends the 
circuits for the provers to Arthur in the first round. In other words, the proof 
shows the inclusion chain EXP ⊆ MIP[P/poly] ⊆ MA. Since the MA protocol is 
a single prover interactive protocol, we also have MIP[P/poly] = IP[P/poly] ⊆ 
MA. 

The above collapse consequence result of motivates the study of lowness 
properties of IP[P/poly]. Our next result states that IP[P/poly] ⊆ Low(ZPP^NP), 
improving the containment IP[P/poly] ⊆ Low($\Sigma_2^p$) shown in Q. Our result 
strengthens the result of that NP sets in P/poly with self-computable 
witnesses are low for ZPP^NP. IP[P/poly] contains such NP sets, but IP[P/poly] may 
not even be contained in NP. Although IP[P/poly] ⊆ MA ⊆ AM, IP[P/poly] is 
not known to be closed under complement, and it is not known if IP[P/poly] 
is contained in coAM. Thus, IP[P/poly] ⊆ Low(ZPP^NP) appears incomparable 
to AM ∩ coAM ⊆ Low(ZPP^NP) shown in Theorem 1 in the previous section. 
Our result is also incomparable to the result in {2DI that self-reducible sets in 
P/poly are low for ZPP^NP. An interesting aspect of our proof is that it combines 
derandomization and almost uniform random sampling. 

Theorem 2. IP[P/poly] is low for ZPP^NP. 

The above lowness result easily extends to IP[(NP ∩ co-NP)/poly] by 
observing that the proof relativizes in the following sense: for any oracle set A, 
NP^{IP[P^A/poly]} ⊆ ZPP^{NP^A}. 

We conclude this section with another connection to the average-case com- 
plexity setting. 

Theorem 3. If NP ⊆ AV and NP ⊆ P/poly then PH collapses to $\Delta_2^p$. 

5 Nonuniform Function Classes and Lowness 

We now study lowness properties of NPMV/poly, NPSV/poly, NPMVt/poly, 
and NPSVt/poly. These are nonuniform analogs of the function classes NPMV, 
NPSV, NPMVt, and NPSVt studied by Selman and other researchers, 
e.g. m. These nonuniform classes are interesting because when restricted to 
characteristic functions of sets, NPSVt/poly coincides with (NP ∩ co-NP)/poly 
and NPMV/poly coincides with NP/poly ∩ co-NP/poly. Likewise, we note that 
the two subclasses of NP/poly ∩ co-NP/poly studied in , namely all sets 
underproductively reducible to sparse sets and all sets overproductively reducible 
to sparse sets, also coincide with NPSV/poly and NPMVt/poly, respectively. 

Following Selman’s notation in a transducer is a nondeterministic Turing 
machine (NDTM for short) T with a write-only output tape. On input x, machine 
T outputs y ∈ Σ* if there is an accepting path on input x along which y is output. 
Hence, the function defined by T on Σ* could be multivalued and partial. Given 
a multivalued function f on Σ* and x ∈ Σ* we use the notation 

$\text{set-}f(x) = \{y \mid f : x \mapsto y\}$ 

to denote the (possibly empty) set of function values for input x. We recall the 
basic definitions. 

Definition 1. 0 

1. NPMV is the class of multivalued, partial functions f for which there is a 
polynomial-time NDTM N such that 

(a) f(x) is defined (i.e., $\text{set-}f(x) \ne \emptyset$) if and only if N(x) has an accepting 
path. 

(b) $y \in \text{set-}f(x)$ if and only if there is an accepting path of N(x) where y is 
output. 

2. NPSV is the class of single-valued partial functions in NPMV. 

3. NPMVt is the class of total functions in NPMV. 

4. NPSVt is the class of total single-valued functions in NPMV. 

The classes NPMV/poly, NPSV/poly, NPMVt/poly, and NPSVt/poly are 
the standard nonuniform analogs of the above classes, defined as usual: for 
F ∈ {NPMV, NPSV, NPMVt, NPSVt}, a multivalued partial function f is in 
F/poly if there is a function g ∈ F, a polynomial p, and an advice function 
$h : 1^* \to \Sigma^*$ with $|h(1^n)| = p(n)$ for all n, such that for all x ∈ Σ*, 

$\text{set-}f(x) = \text{set-}g(\langle x, h(1^{|x|}) \rangle)$. 
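
The distinction between multivalued, single-valued, partial and total functions can be illustrated with a toy example that is not taken from the paper: the function mapping an integer to its nontrivial divisors. Each nondeterministic path is imagined to guess a candidate d and to accept, with output d, exactly when d divides x; the sketch simply enumerates all guesses deterministically, so it models the value sets only and makes no claim about running time.

```python
def set_f(x: int) -> set[int]:
    """Value set of the toy multivalued function 'nontrivial divisor of x'.

    A path guessing d accepts and outputs d iff 1 < d < x and d divides x.
    """
    return {d for d in range(2, x) if x % d == 0}


def is_total_on(inputs: list[int]) -> bool:
    """Total on the given inputs: every value set is non-empty."""
    return all(set_f(x) for x in inputs)


def is_single_valued_on(inputs: list[int]) -> bool:
    """Single-valued on the given inputs: no value set has two elements."""
    return all(len(set_f(x)) <= 1 for x in inputs)
```

On primes the value set is empty, so the function is partial; on a number such as 12 it has several values, so it is multivalued; restricted to squares of primes it is single-valued and total.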

Before we connect these classes to NP/poly ∩ co-NP/poly and its subclasses 
defined in HH , we recall definitions from HH: Consider polynomial-time 
nondeterministic oracle machines N whose computation paths can have three possible 
outcomes: accept, reject, or ?. The machine N can also be viewed as a transducer 
which computes, for given oracle D and input x, a multivalued function. More 
precisely, if we identify accept with value 1 and reject with 0, and consider the 
? computation paths as rejecting paths then $N^D$ defines a partial multivalued 
function: $\text{set-}N^D(x) \subseteq \{0,1\}$. Machine $N^D$ is said to be underproductive if for 
each x we have $\{0,1\} \not\subseteq \text{set-}N^D(x)$, and N is said to be robustly underproductive 
if for each oracle D and input x we have $\{0,1\} \not\subseteq \text{set-}N^D(x)$. Likewise, $N^D$ is 
overproductive if for each x we have $\text{set-}N^D(x) \ne \emptyset$, and N is said to be robustly 
overproductive if for each oracle D and input x we have $\text{set-}N^D(x) \ne \emptyset$. 
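
The two productivity conditions can be checked mechanically on a toy model. The sketch below assumes the reading that underproductive means the value set never contains both 0 and 1, and overproductive means the value set is never empty; a machine is modeled, hypothetically, as a function returning the list of outcomes of its computation paths for a fixed oracle.

```python
from typing import Callable


def value_set(path_outcomes: list[str]) -> set[int]:
    """Map path outcomes to values: accept gives 1, reject gives 0, '?' gives nothing."""
    values = set()
    for outcome in path_outcomes:
        if outcome == "accept":
            values.add(1)
        elif outcome == "reject":
            values.add(0)
    return values


def is_underproductive(machine: Callable[[int], list[str]], inputs: list[int]) -> bool:
    """No input yields both values 0 and 1."""
    return all(value_set(machine(x)) != {0, 1} for x in inputs)


def is_overproductive(machine: Callable[[int], list[str]], inputs: list[int]) -> bool:
    """Every input yields at least one value."""
    return all(len(value_set(machine(x))) > 0 for x in inputs)
```

A machine that is both underproductive and overproductive produces exactly one value on every input, so it computes a total single-valued 0/1 function, i.e. a characteristic function.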

With standard arguments we can convert a sparse set into a polynomial-size 
advice string and vice versa (see, e.g. 0). It follows that A ∈ NP/poly ∩ 
co-NP/poly if and only if there is a sparse set S and a nondeterministic machine 
N such that $N^S$ is both overproductive and underproductive and $A = L(N^S)$. 
Similarly, A ∈ (NP ∩ co-NP)/poly if and only if there is a sparse set S and 
a nondeterministic machine N such that N is both robustly 
overproductive and robustly underproductive and $A = L(N^S)$. 

Proposition 1. Let $\chi_A$ denote the characteristic function for a set $A \subseteq \Sigma^*$:

1. $\chi_A$ is in NPMV/poly if and only if $A$ is in NP/poly $\cap$ co-NP/poly.

2. $\chi_A$ is in NPSV$_t$/poly if and only if $A$ is in (NP $\cap$ co-NP)/poly.

3. $\chi_A$ is in NPSV/poly if and only if there are a sparse set $S$ and a robustly
underproductive machine $N$ such that $A = L(N^S)$.

4. $\chi_A$ is in NPMV$_t$/poly if and only if there are a sparse set $S$ and a robustly
overproductive machine $N$ such that $A = L(N^S)$.

By abuse of notation, we identify $\chi_A$ with $A$ in this section. E.g., we write
$A \in$ NPSV/poly when we mean $\chi_A \in$ NPSV/poly. We now turn to lowness
questions for the nonuniform function classes. The classes NP/poly $\cap$ co-NP/poly
and (NP $\cap$ co-NP)/poly are of interest in the context of deriving strong collapse
consequences from the assumption that NP (or some other hard complexity class)
is contained in one of these classes. We recall the known collapse consequence
result shown in m for NP/poly $\cap$ co-NP/poly under the assumption that NP is
contained therein: If NP $\subseteq$ NP/poly $\cap$ co-NP/poly then PH collapses to ZPP$^{\Sigma_2^p}$.
The open question here is whether the collapse consequence can possibly be
improved to ZPP$^{\mathrm{NP}}$. This is one reason to consider classes that lie between
NP/poly $\cap$ co-NP/poly and (NP $\cap$ co-NP)/poly.



5.1 A Lowness Result for NPMV$_t$/poly

It is shown in HH that if an NP-complete problem is in NPMV$_t$/poly then
PH collapses to $\Sigma_2^p$. In P] the authors actually state this result in terms of
overproductive reductions to sparse sets. We use ideas in their proof to show
the underlying lowness result for functions: all word-decreasing self-reducible
functions in NPMV$_t$/poly are low for $\Sigma_2^p$. We first recall the definition of
word-decreasing self-reducible sets (and define its obvious extension to total
single-valued functions).

Definition 2. |B] For strings $x, y \in \Sigma^*$, $x \prec y$ if $|x| < |y|$, or $|x| = |y|$ and $x$
is lexicographically smaller than $y$. A set $A$ is word-decreasing self-reducible if
there is a polynomial-time oracle machine $M$ such that $A = L(M^A)$, where on
any input $x$ the machine $M$ queries the oracle only about strings $y$ such that
$y \prec x$. Similarly, a total single-valued function $f$ on $\Sigma^*$ is word-decreasing
self-reducible if there is a polynomial-time oracle transducer $T$ such that $T^f$ computes
$f$, where on any input $x$, transducer $T$ can query the oracle only about strings $y$
such that $y \prec x$.
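
To make Definition 2 concrete, the following Python sketch implements the order
$\prec$ and a guard that rejects any oracle query not strictly below the input. The
names `precedes` and `run_self_reduction`, as well as the toy parity language, are
assumptions made for this illustration and do not come from the paper.

```python
def precedes(x: str, y: str) -> bool:
    """True if x is strictly below y: shorter, or same length and lexicographically smaller."""
    return (len(x), x) < (len(y), y)


def run_self_reduction(machine, x, oracle):
    """Run `machine` on x; every oracle query must be strictly below x."""
    def guarded(y):
        if not precedes(y, x):
            raise ValueError(f"query {y!r} is not below input {x!r}")
        return oracle(y)
    return machine(x, guarded)


def parity_machine(x, ask):
    """Toy self-reduction: x has an odd number of 1s iff x[:-1] does, flipped by the last bit."""
    if x == "":
        return False
    return ask(x[:-1]) != (x[-1] == "1")


def parity(x):
    # The oracle is the language itself, which is what self-reducibility means.
    return run_self_reduction(parity_machine, x, parity)
```

Each query made by `parity_machine` is a proper prefix of the input, hence shorter
and therefore below it in the order, so the guard never fires for this toy language.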

The definition of lowness extends naturally to total, single-valued functions:
A functional oracle $f$ returns $f(x)$ on query $x$. For any relativizable complexity
class $\mathcal{C}$ we say that $f \in \mathrm{Low}(\mathcal{C})$ if $\mathcal{C}^f = \mathcal{C}$. We show next that self-reducible sets
and self-reducible functions in NPMV/poly have identical lowness properties.
Hence it suffices to prove lowness of self-reducible sets in NPMV/poly.

Theorem 4. Let $\mathcal{F}$ contain all self-reducible functions in any of the four
function classes {NPMV/poly, NPSV/poly, NPMV$_t$/poly, NPSV$_t$/poly}. Let $\mathcal{C}$ be
the subclass of $\mathcal{F}$ consisting of characteristic functions (making $\mathcal{C}$ a language
class, essentially). For every self-reducible function $f \in \mathcal{F}$ there is a self-reducible
set $A \in \mathcal{C}$ such that $f$ and $A$ are polynomial-time Turing equivalent.

Proof. Given $f \in \mathcal{F}$, we can define the corresponding set $A \in \mathcal{C}$ by suitably
encoding, for each $x$, the bits of $f(x)$ in $A$. We can easily ensure that the
self-reducibility of $f$ carries over to $A$ and that $f$ and $A$ are polynomial-time Turing
equivalent. □
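
The proof leaves the encoding unspecified. One possible encoding, given here only
as an assumption for illustration, stores for each $x$ and position $i$ whether $f(x)$
has more than $i$ bits and whether bit $i$ equals 1. The Python sketch below (with
hypothetical names `make_set_oracle` and `recover_value`) shows that $f(x)$ can then
be recovered with a number of membership queries linear in $|f(x)|$; it does not
attempt to show that self-reducibility is preserved.

```python
def make_set_oracle(f):
    """Membership test for a set A that encodes the bits of f(x) (hypothetical encoding)."""
    def in_A(x, i, kind):
        y = f(x)
        if kind == "len":                    # "f(x) has more than i bits"
            return i < len(y)
        return i < len(y) and y[i] == "1"    # kind == "bit": "bit i of f(x) is 1"
    return in_A


def recover_value(x, in_A):
    """Recompute f(x) using only membership queries to A."""
    bits = []
    i = 0
    while in_A(x, i, "len"):
        bits.append("1" if in_A(x, i, "bit") else "0")
        i += 1
    return "".join(bits)
```

Conversely, each membership query to this set is answered by a single evaluation of
$f$, which gives the other direction of the Turing equivalence for this encoding.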
Theorem 5. Word-decreasing self-reducible sets in NPMV$_t$/poly are low for
$\Sigma_2^p$.

Since $\Sigma_k^p$, $\Pi_k^p$, PP, C$_=$P, Mod$_m$P, PSPACE, and EXP have many-one
complete word-decreasing self-reducible sets 0, the following corollary is immediate.

Corollary 4. If $\mathcal{C} \in \{\Sigma_k^p, \Pi_k^p, \text{PP}, \text{C}_=\text{P}, \text{Mod}_m\text{P}, \text{PSPACE}, \text{EXP}\}$, for $k \geq 1$,
has a complete set in NPMV$_t$/poly then $\mathcal{C} \subseteq \Sigma_2^p$ and PH $= \Sigma_2^p$.

The proof follows since for each $\mathcal{C} \in \{\Sigma_k^p, \Pi_k^p, \text{PP}, \text{C}_=\text{P}, \text{Mod}_m\text{P}, \text{PSPACE},
\text{EXP}\}$ and any set $A$ complete for $\mathcal{C}$ w.r.t. polynomial-time Turing reductions
we have Ef C 

We end this section with the observation that AM $\cap$ coAM is contained in
NPMV$_t$/poly. It is interesting to now compare the lowness results (Theorems E
and El) for these classes.

Proposition 2. If $L \in$ AM $\cap$ coAM then $L$ is in NPMV$_t$/poly.

Proof. Given $L \in$ AM $\cap$ coAM, as already observed in the proof of Theorem d
there are NP sets $A$ and $B$, a polynomial $p$, and sufficiently large subsets
$S_m \subseteq \{0,1\}^{p(m)}$ such that for all $m$ and all strings $x$ of length $|x| \leq m$,

$$x \in L \ \text{implies}\ \forall w : (x, w) \in A \ \text{and}\ \forall w \in S_m : (x, w) \notin B,$$

$$x \notin L \ \text{implies}\ \forall w : (x, w) \in B \ \text{and}\ \forall w \in S_m : (x, w) \notin A.$$

We can combine the NP machines $M_A$ and $M_B$ for $A$ and $B$ and build a
transducer $T$ that on input $(x, w)$ outputs 1 (0) on any accepting simulation
of $M_A$ (resp. $M_B$) on input $(x, w)$. Observe that in case $w \in S_m$ transducer $T$
will always yield a single-valued, total computation for all inputs $x$ of length $m$,
outputting either 1 or 0 depending on the membership of $x$. On the other hand,
no matter which $w \in \{0,1\}^{p(m)}$ is used as advice, $(x, w)$ is either in $A$ or in $B$,
and so the transducer $T$ always outputs at least one of 0 or 1 for any advice
string $w \in \{0,1\}^{p(m)}$ and any input $x$ of length $m$. Hence it follows that $L$ is in
NPMV$_t$/poly. □
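
The behaviour of the combined transducer can be mimicked with a small Python
sketch. Here the NP sets $A$ and $B$ are replaced by ordinary Boolean predicates,
and the toy language and the set of "good" advice strings are invented for the
example, so this only illustrates the case distinction in the proof, not an actual
AM protocol.

```python
def transducer_outputs(x, w, in_A, in_B):
    """Set of values output by the combined transducer on input (x, w)."""
    outputs = set()
    if in_A(x, w):      # an accepting simulation of the machine for A outputs 1
        outputs.add(1)
    if in_B(x, w):      # an accepting simulation of the machine for B outputs 0
        outputs.add(0)
    return outputs


# Toy instance (assumption): L = strings ending in "1", good advice set S = {"00"}.
GOOD_ADVICE = {"00"}

def in_L(x):
    return x.endswith("1")

def in_A(x, w):
    return in_L(x) or w not in GOOD_ADVICE

def in_B(x, w):
    return (not in_L(x)) or w not in GOOD_ADVICE
```

With good advice the output set is a singleton holding the membership bit; with any
other advice it may contain both values, but it is never empty, which is the totality
property used in the proof.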

5.2 A Lowness Result for NPSV/poly

In CH it is left as an open problem to discover new lowness (or collapse
consequence) results for NPSV/poly. As noted in El, nothing better is known
for NPSV/poly than the collapse consequence result: if SAT is in NPSV/poly
then PH collapses to ZPP$^{\Sigma_2^p}$, which holds even for the larger class NP/poly $\cap$
co-NP/poly pn| .

We show that sets in NPSV/poly that are checkable, in the sense of program
checking as defined by Blum and Kannan |Sj, are low for AM and for ZPP$^{\mathrm{NP}}$.
Since $\oplus$P, PP, PSPACE, and EXP have checkable complete problems, it follows
that for any of these classes inclusion in NPSV/poly implies its containment in
AM $\cap$ coAM. This result is proved along the same lines as the result of Babai et al. pj:
If EXP is contained in P/poly then EXP $\subseteq$ MA.

Recall the definitions of MIP[$\mathcal{C}$] and IP[$\mathcal{C}$] for a class $\mathcal{C}$ of languages. We prove
a technical lemma that immediately yields the lowness result.

Lemma 1. If $A \in$ NPSV/poly then MIP[$A$] $\subseteq$ AM.

Proof. Let $L \in$ MIP[$A$] for some set $A \in$ NPSV/poly. Let $T$ be a
nondeterministic transducer and $q$ be a polynomial witnessing that $A \in$ NPSV/poly. We
describe an MAM protocol for $L$:

1. Let $x$ be an input of length $n$ to the protocol. Let $m = p(n)$, where $p$ is a
polynomial bounding the size of the queries to $A$ made by the provers during
the MIP[$A$] protocol for inputs of length $n$.

2. Merlin sends advice $w$ of length $q(m)$ to Arthur.

3. Arthur sends a polynomial-length random string $r$ (used for simulating the
original MIP protocol) to Merlin.

4. Merlin sends back the list of successive queries to set $A$ (generated by
simulating the original MIP protocol with random string $r$) and the list of answers to
those queries, along with the computation paths of transducer $T$ with advice
$w$ that certify the answers to the queries.

5. Arthur verifies in polynomial time that Merlin's message is correct
and accepts if and only if the original MIP protocol accepts.

Since $T$ computes a single-valued partial function for any advice $w$, the
verifier, although it is simulating the nondeterministic transducer $T$, is
guaranteed that every accepting computation path has the same output and hence
leads to the same computation. Thus, what makes the above MAM protocol work
is the fact that for any advice $w$ and query $y$ all accepting computation paths of
$T(y, w)$ output the same value. So, regardless of which computation paths are
sent to Arthur by Merlin in Step 4 of the above protocol, Arthur's decision will
be the same. In other words, Arthur's acceptance depends only on the random
string $r$, hence exactly preserving the acceptance probability of the original MIP
protocol.

Standard techniques (cf. P]) can be used to convert the MAM protocol to
an AM protocol. This completes the proof. □

We have as immediate consequence the following lowness result. 

Theorem 6. If $L$ is a checkable set in NPSV/poly then $L \in$ AM $\cap$ coAM and
hence low for AM and ZPP$^{\mathrm{NP}}$.

Proof. The assumption in the theorem's statement implies that both $L$ and $\overline{L}$
are in MIP[$L$] by the checker characterization theorem of jSI. Now, applying
Lemma 1 yields that both $L$ and $\overline{L}$ are in AM and the result follows. □






We can derive new collapse consequences as a corollary, since the classes $\oplus$P,
PP, PSPACE, and EXP all have checkable complete problems. It follows that
for any of these classes inclusion in NPSV/poly implies its containment in
AM $\cap$ coAM.

Corollary 5. If any of the classes $\oplus$P, PP, PSPACE, and EXP is contained in
NPSV/poly then it is low for AM and hence PH = AM.

Notice that we have the same lowness for checkable functions in NPSV/poly.

Corollary 6. Checkable functions in NPSV/poly are low for AM and ZPP$^{\mathrm{NP}}$.

Proof. Let $f$ be a checkable function in NPSV/poly. We can suitably encode,
for each $x$, the bits of $f(x)$ in a language $A$ which is polynomial-time Turing
equivalent to $f$ and hence $A$ is also checkable. The lowness result now follows by
invoking Theorem 6. □
Abstract. We study the problem of minimizing the makespan for the
precedence multiprocessor constrained scheduling problem with
hierarchical communications P] . We propose an $\frac{8}{5}$-approximation algorithm for
the UET-UCT (Unit Execution Time Unit Communication Time)
hierarchical problem with an unbounded number of biprocessor machines.
Moreover, we extend this result in the case where each cluster has $m$
processors (where $m$ is a fixed constant) by presenting a $\rho$-approximation
algorithm where $\rho = 2 - \frac{2}{2m+1}$.

1 Introduction 

Task scheduling is one of the most important problems in parallel
computation. In such a context, an application is usually represented as a directed
acyclic graph where the vertices represent the tasks to be executed and the arcs
the communication delays. The parallel architecture is composed of a set of
$m$ identical processors and the problem is to find a feasible schedule
minimizing the makespan, i.e. the time at which the last task of the graph finishes its
execution. Formally this problem can be stated as follows:

Let $G = (V, E)$ be a precedence graph with $n$ tasks, and let $m$ be the number
of available processors. Every task $i \in V$ has a processing time $p_i$ and every arc
$(i, j)$ is associated with a communication delay $c_{ij}$. Let $t_i$ (resp. $t_j$) be the starting
time of task $i$ (resp. $j$); if $i$ and $j$ are executed on the same processor then
$t_j \geq t_i + p_i$, otherwise $t_j \geq t_i + p_i + c_{ij}$. In what follows we call this model the
scheduling model with homogeneous communications.

The objective is to find a schedule, i.e. an allocation of each task to a time
interval on one processor, such that the communication delays are taken into
account and the completion time (makespan) is minimized (the makespan is
denoted by $C_{max}$ and it corresponds to $\max_{i \in V}\{t_i + p_i\}$).

This problem has been extensively studied in the last few years. It is NP-hard
even for surprisingly simple cases like the well known UET-UCT case where all
the execution times and the communication delays are unitary and an unbounded
number of processors is available. Using the notation of |0|, this last case is
denoted as $\bar{P} \mid prec; c_{ij} = 1; p_i = 1 \mid C_{max}$. It was shown that there is no hope
to find a heuristic for $\bar{P} \mid prec; c_{ij} = 1; p_i = 1 \mid C_{max}$ with relative performance
strictly less than 7/6 (unless P = NP) jl|, and the best known approximation
algorithm is due to Munier and König with a worst-case relative performance
equal to 4/3 0-

We consider here an extension of this classical scheduling model which
takes into account hierarchical communications ra 0 . This extension is
motivated by the advance of hierarchical parallel architectures. Parallel architectures
of this type include parallel machines constituted by different multiprocessors,
biprocessors connected by Myrinet switches, architectures where the processors
are connected by hierarchical busses, or point-to-point architectures where each
component of the topology is a cluster of processors. Formally, we are given
$m$ multiprocessor machines (or clusters) that are used to process $n$ precedence
constrained tasks. Each machine (cluster) comprises several identical parallel
processors. A couple $(c_{ij}, \epsilon_{ij})$ of communication delays is associated to each arc
$(i, j)$ between two tasks of the precedence graph. In what follows, $c_{ij}$ (resp. $\epsilon_{ij}$)
is called intercluster (resp. interprocessor) communication, and we consider that
$c_{ij} \geq \epsilon_{ij}$. If tasks $i$ and $j$ are executed on different machines, then $j$ must be
processed at least $c_{ij}$ time units after the completion of $i$. Similarly, if $i$ and $j$ are
executed on the same machine but on different processors then the processing
of $j$ can only start $\epsilon_{ij}$ units of time after the completion of $i$. However, if $i$ and $j$
are executed on the same processor then $j$ can start immediately after the end
of $i$. The communication overhead (intercluster or interprocessor delay) does not
interfere with the availability of the processors and all processors may execute
other tasks. Our goal is to find a feasible schedule of the tasks minimizing the
makespan.

Notice that the hierarchical model that we consider here is a generalization
of the scheduling model with homogeneous communication delays. Consider for
instance that for every arc $(i, j)$ of the precedence graph we have $c_{ij} = \epsilon_{ij}$. In
that case the hierarchical model is exactly the classical scheduling model with
homogeneous communications.

We focus on the case where the number of clusters is unrestricted, the number
of processors within each cluster is equal to two and the intercluster (resp.
interprocessor) communication is equal to $c_{ij} = 1$ (resp. $\epsilon_{ij} = 0$).

Using an extension of the classical notation of Lenstra et al. this problem
is denoted as $\bar{P}, 2 \mid prec; (c_{ij}, \epsilon_{ij}) = (1, 0); p_i = 1 \mid C_{max}$. Recently, for the
multiprocessor scheduling problem with hierarchical communications, it has been
proved in [2( that there is no hope of finding a heuristic with relative
performance strictly less than 5/4 (unless P = NP). This result is an extension of
the result of Hoogeveen et al. 0, who proved that there is no polynomial time
$\rho$-approximation algorithm with $\rho < 7/6$ for the well-known UET-UCT
scheduling problem with homogeneous communication delays
($\bar{P} \mid prec; c_{ij} = 1; p_i = 1 \mid C_{max}$). However, it has been proved that the problem is
polynomial if the duplication of tasks is allowed.
In what follows, we propose a new scheduling algorithm based on an
LP-relaxation that improves the trivial bound of two.



1.1 Preliminaries 

Given a precedence graph $G = (V, E)$, a predecessor (resp. successor) of a task $i$
is a task $j$ such that $(j, i)$ (resp. $(i, j)$) is an arc of $G$. For every task $i \in V$, $\Gamma^+(i)$
(resp. $\Gamma^-(i)$) denotes the set of immediate successors (resp. predecessors) of $i$.
We denote the tasks without predecessor (resp. successor) by $Z$ (resp. $U$). We
call source every task belonging to $Z$.

A schedule $\sigma$ is a set of $n$ ordered triples $\{(i, M_i, t_i), i \in V\}$, representing
that the task $i$ is performed by one of the processors of the cluster $M_i$ at time
$t_i$. Every feasible schedule must respect the following constraints:

1. at any time, a cluster executes at most two tasks;

2. $\forall (i, j) \in E$, if $M_i = M_j$ then $t_j \geq t_i + 1$, otherwise $t_j \geq t_i + 2$.

The makespan of schedule $\sigma$ is:

$$C_{max}^{\sigma} = \max_{i \in V} (t_i + 1).$$



The problem is to find a feasible schedule with a minimum makespan. 
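
The two feasibility constraints and the makespan can be checked mechanically. The
Python sketch below is an illustration only: the function names, the dictionary
representation of a schedule, and the assumption that unit tasks start at integer
times are choices made for the example, not part of the paper.

```python
from collections import Counter


def is_feasible(arcs, schedule, procs_per_cluster=2):
    """schedule maps task -> (cluster, integer start time); every task has unit length."""
    # Constraint 1: a cluster runs at most `procs_per_cluster` tasks in any time slot.
    load = Counter(schedule.values())
    if any(count > procs_per_cluster for count in load.values()):
        return False
    # Constraint 2: one time unit on the same cluster, two across clusters.
    for i, j in arcs:
        (m_i, t_i), (m_j, t_j) = schedule[i], schedule[j]
        delay = 1 if m_i == m_j else 2
        if t_j < t_i + delay:
            return False
    return True


def makespan(schedule):
    """Completion time of the last unit-length task."""
    return max(t for _, t in schedule.values()) + 1


# Graph with arcs (j,i), (j,k), (l,k), (l,m): three successors cannot share one slot.
arcs = [("j", "i"), ("j", "k"), ("l", "k"), ("l", "m")]
crowded = {"j": (0, 0), "l": (0, 0), "i": (0, 1), "k": (0, 1), "m": (0, 1)}
spread = {"j": (0, 0), "l": (0, 0), "i": (0, 1), "k": (0, 1), "m": (1, 2)}
```

In the second schedule task `m` moves to another cluster and therefore has to wait
two time units after `l`, which is exactly the intercluster delay of the model.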

In order to evaluate the worst-case performance of an algorithm, we recall
the definition of the relative performance of a heuristic $h$:

$$\rho^h = \max_{G} \frac{C_{max}^{h}(G)}{C_{max}^{opt}(G)},$$

where $C_{max}^{opt}(G)$ denotes the optimal makespan of a feasible schedule of the
graph $G$, and $C_{max}^{h}(G)$ the makespan obtained by the heuristic $h$.

In the next section, we formulate the problem as an integer linear program
(ILP). In the third section, we propose a simple heuristic based on a relaxation
of the ILP that we analyze in the fourth section by evaluating its worst-case
relative performance. In the last section, we extend this result by providing a
$\left(2 - \frac{2}{2m+1}\right)$-approximation algorithm and we show that this bound is tight.



2 The Integer Linear Program 

Let us consider an instance of the problem $\bar{P}, 2 \mid prec; (c_{ij}, \epsilon_{ij}) = (1, 0); p_i =
1 \mid C_{max}$ given by a directed acyclic graph $G = (V, E)$ with $n$ tasks.

The aim of this section is to model the above described scheduling problem
by an integer linear program (ILP) denoted, in what follows, by $\Pi$.

We model the scheduling problem by a set of equations defined on the starting
times vector $(t_1, \ldots, t_n)$: in every feasible schedule, every task $i \in V - U$ has at
most two successors, w.l.o.g. call them $j_1$ and $j_2 \in \Gamma^+(i)$, that can be performed
by the same cluster as $i$ at time $t_{j_1} = t_{j_2} = t_i + 1$.

The other successors of $i$, if any, satisfy: $\forall k \in \Gamma^+(i) - \{j_1, j_2\}$, $t_k \geq t_i + 2$. So,
for every arc $(i, j) \in E$, we introduce a variable $x_{ij} \in \{0, 1\}$ and the following
constraints:

$$\forall (i, j) \in E, \quad t_i + 1 + x_{ij} \leq t_j$$

and

$$\sum_{j \in \Gamma^+(i)} x_{ij} \geq |\Gamma^+(i)| - 2.$$

Similarly, every task $i$ of $V - Z$ has at most two predecessors, w.l.o.g. call
them $j_1$ and $j_2 \in \Gamma^-(i)$, that can be performed by the same cluster as $i$ at time
$t_{j_1} = t_{j_2} = t_i - 1$, so

$$\sum_{j \in \Gamma^-(i)} x_{ji} \geq |\Gamma^-(i)| - 2.$$

If we denote by $C_{max}$ the makespan of the schedule,

$$\forall i \in V, \quad t_i + 1 \leq C_{max}.$$



The above constraints are necessary but not sufficient conditions in order to
get a feasible schedule for our problem. For instance, a solution minimizing $C_{max}$
for the graph of case (a) in Figure 1 will assign to every arc the value 0. However,
since every cluster has two processors, and so at most two tasks can be processed
on the same cluster simultaneously, the obtained solution is clearly not feasible.
Thus, relaxing the integer constraints to $0 \leq x_{ij} \leq 1$ and solving the
resulting linear program, whose objective function is the
minimization of $C_{max}$, gives only a lower bound on the value of $C_{max}$.

In order to improve this lower bound, we consider every subgraph of $G$ that
is isomorphic to the graphs given in Figure 1, cases (a) and (b). It is easy to
see that in any feasible schedule of $G$, at least one of the variables associated to
the arcs of each one of these graphs must be set to one. So, we add the following
constraints:

Fig. 1. Two cases, (a) and (b).

- For the case (a):
$\forall i, j, k, l, m \in V$ such that $(j, i), (j, k), (l, k), (l, m) \in E$,
$x_{ji} + x_{jk} + x_{lk} + x_{lm} \geq 1$.

- For the case (b):
$\forall i, j, k, l, m \in V$ such that $(i, j), (k, j), (k, l), (m, l) \in E$,
$x_{ij} + x_{kj} + x_{kl} + x_{ml} \geq 1$.
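
The pattern constraints can be enumerated directly from the arc list. The Python
sketch below is a hedged illustration: the helper names `case_a_patterns` and
`violated_constraints` are invented here, the five tasks of a pattern are assumed
to be pairwise distinct, and case (b) is obtained by applying the case (a) search
to the reversed graph.

```python
from itertools import permutations


def case_a_patterns(arcs):
    """Yield (j, i, k, l, m) such that (j,i), (j,k), (l,k), (l,m) are arcs, i, k, m distinct."""
    succ = {}
    for u, v in arcs:
        succ.setdefault(u, set()).add(v)
    for j, l in permutations(succ, 2):
        for k in succ[j] & succ[l]:
            for i in succ[j] - {k}:
                for m in succ[l] - {k, i}:
                    yield (j, i, k, l, m)


def violated_constraints(arcs, x):
    """Return the case (a)/(b) patterns whose four arc values sum to less than 1."""
    bad = []
    for j, i, k, l, m in case_a_patterns(arcs):
        if x[(j, i)] + x[(j, k)] + x[(l, k)] + x[(l, m)] < 1:
            bad.append(("a", (j, i, k, l, m)))
    reversed_arcs = [(v, u) for u, v in arcs]
    for j, i, k, l, m in case_a_patterns(reversed_arcs):
        # In the original graph the arcs are (i,j), (k,j), (k,l), (m,l).
        if x[(i, j)] + x[(k, j)] + x[(k, l)] + x[(m, l)] < 1:
            bad.append(("b", (i, j, k, l, m)))
    return bad


arcs = [("j", "i"), ("j", "k"), ("l", "k"), ("l", "m")]
all_zero = {a: 0.0 for a in arcs}
quarter = {a: 0.25 for a in arcs}
```

Each pattern is reported once per symmetric labelling, which is harmless for a
check like this one. Assigning 0.25 to all four arcs satisfies the constraint with
equality, the situation that the rounding threshold of the heuristic is built around.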

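Thus, in what follows, we consider the following ILP:

$$
(\Pi) \left\{
\begin{array}{ll}
\min C_{max} & \\
\forall (i, j) \in E, & x_{ij} \in \{0, 1\} \\
\forall i \in V, & t_i \geq 0 \\
\forall (i, j) \in E, & t_i + 1 + x_{ij} \leq t_j \\
\forall i \in V - U, & \sum_{j \in \Gamma^+(i)} x_{ij} \geq |\Gamma^+(i)| - 2 \\
\forall i \in V - Z, & \sum_{j \in \Gamma^-(i)} x_{ji} \geq |\Gamma^-(i)| - 2 \\
\forall i, j, k, l, m \in V \mid (j, i), (j, k), (l, k), (l, m) \in E, & x_{ji} + x_{jk} + x_{lk} + x_{lm} \geq 1 \\
\forall i, j, k, l, m \in V \mid (i, j), (k, j), (k, l), (m, l) \in E, & x_{ij} + x_{kj} + x_{kl} + x_{ml} \geq 1 \\
\forall i \in V, & t_i + 1 \leq C_{max}
\end{array}
\right.
$$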

Once again the integer linear program given above does not always yield
a feasible solution for our scheduling problem. For instance, if we consider the
precedence graph given in Figure 2, the optimal solution of the integer linear
program will set all the arcs to 0. Clearly, this is not a feasible solution for our
scheduling problem. However, our goal in this step is to get a good lower bound
of the makespan and a solution, possibly infeasible, that we will transform
into a feasible one (this transformation is given below).




Fig. 2. Our integer programming formulation does not always imply a feasible 
solution. 

Let $\Pi^{inf}$ denote the linear program corresponding to $\Pi$ in which we relax
the integer constraints $x_{ij} \in \{0, 1\}$ by setting $x_{ij} \in [0, 1]$. Given that the number
of variables and the number of constraints are polynomially bounded, this linear
program can be solved in polynomial time. The solution of $\Pi^{inf}$ will assign to
every arc $(i, j) \in E$ a value $x_{ij} = e_{ij}$ with $0 \leq e_{ij} \leq 1$ and will determine a lower
bound of the value of $C_{max}$ that we denote by $\Theta^{inf}$.

Lemma 1. $\Theta^{inf}$ is a lower bound of an optimal solution for
$\bar{P}, 2 \mid prec; (c_{ij}, \epsilon_{ij}) = (1, 0); p_i = 1 \mid C_{max}$.

Proof. This is true since any optimal feasible solution of the scheduling problem
must satisfy all the constraints of the integer linear program $\Pi$. □



3 Obtaining a Feasible Solution 



The algorithm is divided in two steps:

1. Step 1 [Rounding]: We transform the solution of the relaxed linear program
into an integer one in the following way:
if $e_{ij} < 0.25$ (resp. $e_{ij} \geq 0.25$) then $x_{ij} = 0$ (resp. $x_{ij} = 1$).

In the following, we call an arc $(i, j) \in E$ a 0-arc (resp. 1-arc) if $x_{ij} = 0$
(resp. $x_{ij} = 1$).

The solution given by Step 1 is not necessarily a feasible solution (take for
instance the precedence graph of Figure 2), so we must transform it into a
feasible one. Notice that the cases given in Figure 1 are eliminated by the
linear program.

In the next step we need the following definition.

Definition 1. A critical path with terminal vertex $i \in V$ is the longest path
from an arbitrary source of $G$ to task $i$. The length of a path is defined as
the sum of the processing times of the tasks belonging to this path and of the
values $x_{ij}$ for every arc in the path.

2. Step 2 [Feasible Rounding]: We change the integer solution as follows:

(a) If $i$ is a source then we keep unchanged the values of $x_{ij}$ obtained in
Step 1.

(b) Let $i$ be a task such that all predecessors are already examined. Let $A_i$
be the subset of incoming arcs of $i$ belonging to a critical path with
terminal vertex the task $i$.

i. If the set $A_i$ contains a 0-arc, then all the outgoing arcs $x_{ij}$ take
the value 1.

ii. If the set $A_i$ does not contain any 0-arc (all the critical incoming arcs
are valued to 1), then the value of all the outgoing arcs $x_{ij}$ remains
the same as in Step 1, and all the incoming 0-arcs are transformed
to 1-arcs.

Remark: In Step 2(b)ii changing the value of an incoming 0-arc to 1
does not increase the length of any critical path having as terminal vertex
$i$, because there exists at least one critical path with terminal vertex $i$ such
that an arc $(j, i) \in E$ is valued by the linear program to at least 0.25
($e_{ji} \geq 0.25$), and so $x_{ji}$ is already equal to 1.
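
The two steps can be sketched in Python as follows. This is one reading of the
description above, not the authors' implementation: the function names, the
dictionary `e` of fractional arc values, and the use of the earliest start time as a
stand-in for the critical path length are assumptions made for the example.

```python
from collections import defaultdict, deque


def topological_order(tasks, arcs):
    """Kahn's algorithm: every task appears after all of its predecessors."""
    indeg = {v: 0 for v in tasks}
    succ = defaultdict(list)
    for u, v in arcs:
        succ[u].append(v)
        indeg[v] += 1
    queue = deque(v for v in tasks if indeg[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return order


def round_and_repair(tasks, arcs, e, threshold=0.25):
    """Return the 0/1 arc values after Steps 1 and 2 and the resulting start times."""
    # Step 1: threshold rounding of the fractional values e[(i, j)].
    x = {a: 0 if e[a] < threshold else 1 for a in arcs}
    preds, succs = defaultdict(list), defaultdict(list)
    for i, j in arcs:
        preds[j].append(i)
        succs[i].append(j)
    start = {}
    # Step 2: examine tasks once all their predecessors have been examined.
    for i in topological_order(tasks, arcs):
        if not preds[i]:
            start[i] = 0                      # (a) sources keep their Step 1 values
            continue
        length = {j: start[j] + 1 + x[(j, i)] for j in preds[i]}
        start[i] = max(length.values())
        critical = [j for j in preds[i] if length[j] == start[i]]
        if any(x[(j, i)] == 0 for j in critical):
            for k in succs[i]:                # (b)i: all outgoing arcs become 1-arcs
                x[(i, k)] = 1
        else:
            for j in preds[i]:                # (b)ii: incoming 0-arcs become 1-arcs
                x[(j, i)] = 1
    return x, start
```

On a chain `a -> b -> c` whose fractional values are all 0, Step 1 produces two
0-arcs and Step 2 turns the second one into a 1-arc, so no two 0-arcs are
consecutive in the result.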



Feasibility 

From the constraints of the linear program one can easily show the following
lemma.
Lemma 2. Every job $i \in V$ has at most two successors (resp. predecessors) such
that $e_{ij} < 0.25$ (resp. $e_{ji} < 0.25$).

From the previous lemma, it is clear that after the rounding procedure of
Step 1, if we focus on the tasks of two arbitrary consecutive levels of $G$, we can
easily obtain a feasible schedule of the tasks of these levels, by performing the
tasks that are connected by 0-arcs on the same cluster and in consecutive times.
Unfortunately, this is not the case if we consider the entire graph $G$ (see for
instance the example of Figure 2). Since after Step 2 there are no consecutive
0-arcs, we avoid infeasibility: locally, between two consecutive levels, we
can always execute the tasks that are connected by 0-arcs (in Step 2 we do not
add any new 0-arc), and globally there are no more consecutive 0-arcs (hence,
we have the time to change cluster and communicate, if necessary).

4 Relative Performance of the Heuristic 

First, we prove that 8/5 is an upper bound of $\rho^h$. Then, we show that this value
is reached for a special class of graphs.

4.1 Upper Bound of the Relative Performance

Let us denote by $t_i^h$ the starting time of the task $i$ determined by the heuristic
and by $t_i^*$ the starting time of the task $i$ given by the linear program ($t_i^*$ is the
length of the longest path from a source to the task $i$ including the processing time of the
tasks and the real values of the corresponding arcs).

Lemma 3. For every task $i \in V$, $t_i^h \leq \frac{8}{5} t_i^*$.

Proof. We use induction to prove it.

The inequality is true for every task $i \in Z$ (i.e. tasks such that $t_i^h = t_i^* = 0$) and
for every task $k$ such that $\Gamma^-(k) \subseteq Z$.

Let us now assume that the lemma is valid for all the predecessors of the
task $i$.

Let $A_i$ be the set of the critical incoming arcs (i.e. the arcs having $i$ as terminal
vertex and belonging to a critical path). We have to consider the following cases:

1. One of the arc(s) of $A_i$, denoted by $(j, i)$, is valued to 0 ($x_{ji} = 0$, which means
that $e_{ji} < 0.25$). So $t_i^h = t_j^h + 1$, and $t_i^* \geq t_j^* + 1$. According to the induction
hypothesis we have $t_j^h \leq \frac{8}{5} t_j^*$.

Thus $t_i^h \leq \frac{8}{5} t_j^* + 1$ and consequently $t_i^h \leq \frac{8}{5}(t_i^* - 1) + 1 \leq \frac{8}{5} t_i^*$.

2. One of the arc(s) of $A_i$, denoted by $(j, i)$, is valued to 1 ($x_{ji} = 1$) and
$e_{ji} \geq 0.25$. So $t_i^h = t_j^h + 2$, and $t_i^* \geq t_j^* + 1.25$. According to the induction
hypothesis we have $t_j^h \leq \frac{8}{5} t_j^*$.

Thus $t_i^h \leq \frac{8}{5} t_j^* + 2$ and consequently $t_i^h \leq \frac{8}{5}(t_i^* - 1.25) + 2 \leq \frac{8}{5} t_i^*$.

3. $A_i$ contains a 1-arc denoted by $(j, i)$ such that $e_{ji} < 0.25$. We have
$t_i^h = t_j^h + 2$, and $t_i^* \geq t_j^* + 1$.

Notice that the value associated to this 1-arc has been transformed to 1
after the "study" of task $j$.

So in the set $A_j$ there exists a 0-arc. W.l.o.g. we denote by $(k, j)$ this arc, thus
$t_j^h = t_k^h + 1$, and $t_j^* \geq t_k^* + 1$.

According to the induction hypothesis we have $t_k^h \leq \frac{8}{5} t_k^*$. Thus, $t_i^h = t_k^h + 3$
and $t_i^* \geq t_k^* + 2$.

Hence, we get $t_i^h \leq \frac{8}{5} t_k^* + 3$ and consequently $t_i^h \leq \frac{8}{5}(t_i^* - 2) + 3 \leq \frac{8}{5} t_i^*$.

□

Finally, we obtain our main result:

Theorem 1. The relative performance of our heuristic is bounded above by $\frac{8}{5}$.

Proof. Let us denote by $C_{max}^h$ the makespan of the schedule computed by the
heuristic and by $C_{max}^{opt}$ the optimal value of a schedule.

Let us consider a task $i$ of $U$ such that $C_{max}^h = t_i^h + 1$. Then, according to
Lemma 3, $C_{max}^h \leq \frac{8}{5}(t_i^* + 1)$. Moreover, $t_i^* + 1 \leq \Theta^{inf}$ and $\Theta^{inf} \leq C_{max}^{opt}$, so we
get the theorem.

□



4.2 Tightness of the Bound 

We recursively define a sequence of graphs $G_i$, $i \geq 1$, based on the graphs $B_1$ and
$B_2$ given in Figures 3 and 4 respectively. The values near each task in Figures 3
and 4 correspond to its starting time.

We compute the value of the makespan obtained by our heuristic, denoted
$C_{max}^h(G_i)$, and we propose a schedule $\sigma$ such that:

$$\frac{C_{max}^h(G_i)}{C_{max}^{\sigma}(G_i)} \to \frac{8}{5}.$$

Notation: In what follows, whenever we write "$B_1 \oplus B_2$", we will consider the
graph obtained by the concatenation of the graph $B_1$ and the graph $B_2$, in which
we identify the tasks of the last level of $B_1$ with the tasks of the first level of $B_2$.
More precisely, we have $x'_1 = r_1$ and $y'_1 = r_2$.

The definition of $G_{i-1} \oplus B_2$ is made in a similar way (the tasks of the last
level of $G_{i-1}$ are aggregated with the tasks of the first level of $B_2$, i.e. $q_1 = r_1$
and $q_2 = r_2$ with $q_1, q_2 \in V(G_{i-1})$ and $r_1, r_2 \in V(B_2)$).

$G_i$ is recursively defined in the following way:

• $G_1 = B_1 \oplus B_2$,

• and $G_i = G_{i-1} \oplus B_2$, with $i \geq 2$.
Fig. 3. The graph $B_1$ and the associated schedule $\sigma(B_1)$.

Fig. 4. The graph $B_2$ and the associated schedule $\sigma(B_2)$.

Makespan for Scheduling the Tasks of $G_i$

Lemma 4. The makespan for the graph $B_1$ (resp. $B_2$) obtained by the heuristic
is equal to $C_{max}^h(B_1) = 7$ (resp. $C_{max}^h(B_2) = 9$).

Proof. The solution of the relaxed linear program will assign the value 0.25 to all
the arcs of $B_i$, $i = 1, 2$. Thus, during the first step, the heuristic will transform
all these values to 1, and hence the makespan will be equal to 7 (resp. 9).

□

Lemma 5. The makespan for $G_i$ given by the heuristic is equal to $C_{max}^h(G_i) =
8i + 7$.

Proof. By induction on $i$.

• If $i = 1$ the lemma is valid: $C_{max}^h(G_1) = C_{max}^h(B_1) + C_{max}^h(B_2) - 1 = 15$.

• We assume that the lemma is valid for $i - 1$, $i \geq 2$, i.e. $C_{max}^h(G_{i-1}) = 8(i - 1) + 7$.

We have $C_{max}^h(G_i) = C_{max}^h(G_{i-1}) + C_{max}^h(B_2) - 1 = 8(i - 1) + 7 + 9 - 1 = 8i + 7$.

□

Let us now construct a better schedule that we call $\sigma$. We build $\sigma$ recursively:

• $\sigma(G_1)$ is obtained by concatenating the schedules of $B_1$ and $B_2$ (see Figures
3 and 4), taking of course into account the aggregation of the tasks in $G_1 =
B_1 \oplus B_2$ (given that $x'_1 = r_1$ and $y'_1 = r_2$, we have $t_{x'_1} = t_{r_1}$ and $t_{y'_1} = t_{r_2}$).

• Similarly, $\sigma(G_i)$ is obtained by concatenating carefully the schedules of $G_{i-1}$
and of $B_2$ (taking again into account the aggregation of tasks).

Lemma 6. The makespan for the graph $B_1$ (resp. $B_2$) obtained by $\sigma$ is equal to
$C_{max}^{\sigma}(B_1) = 5$ (resp. $C_{max}^{\sigma}(B_2) = 6$).

Proof. It is obvious by the construction.

□

Lemma 7. The length of the schedule $\sigma$ for the graph $G_i$, $C_{max}^{\sigma}(G_i)$, is equal
to $5i + 5$.

Proof. By induction on $i$.

• If $i = 1$ the lemma is valid ($C_{max}^{\sigma}(G_1) = 10$).

• We assume that the lemma is valid for $i - 1$, $i \geq 2$, i.e. $C_{max}^{\sigma}(G_{i-1}) = 5(i - 1) + 5$.
Since $C_{max}^{\sigma}(B_2) = 6$, we obtain $C_{max}^{\sigma}(G_i) = 5(i - 1) + 5 + 6 - 1 = 5i + 5$.

□

Theorem 2. The bound $\frac{8}{5}$ is reached for $G_i$.

Proof. By Lemmas 5 and 7 we have $C_{max}^h(G_i) = 8i + 7$ and $C_{max}^{\sigma}(G_i) =
5i + 5$.

So,

$$\frac{C_{max}^h(G_i)}{C_{max}^{\sigma}(G_i)} = \frac{8i + 7}{5i + 5} \to \frac{8}{5} \quad (i \to \infty).$$

□
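
The arithmetic behind Lemmas 5 and 7 and the limit in Theorem 2 can be checked
with exact rationals. In the Python sketch below the function names are invented
for the example; the numbers 7, 9, 5 and 6 are the makespans of $B_1$ and $B_2$ stated
in Lemmas 4 and 6, and each concatenation is assumed to overlap by one time unit,
as in the recurrences used in the proofs.

```python
from fractions import Fraction


def heuristic_makespan(i):
    """Makespan of G_i under the heuristic, built from the recurrence of Lemma 5."""
    value = 7 + 9 - 1              # G_1 = B_1 concatenated with B_2
    for _ in range(2, i + 1):
        value += 9 - 1             # each further copy of B_2
    return value


def sigma_makespan(i):
    """Makespan of G_i under the schedule sigma, built from the recurrence of Lemma 7."""
    value = 5 + 6 - 1
    for _ in range(2, i + 1):
        value += 6 - 1
    return value


def ratio(i):
    return Fraction(heuristic_makespan(i), sigma_makespan(i))
```

The gap to the bound is $\frac{8}{5} - \frac{8i+7}{5i+5} = \frac{1}{5(i+1)}$, so the ratio increases with $i$ and
approaches 8/5 without attaining it for any finite $i$.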



5 Extended Model 

In this section, we consider an extension of the studied problem where
each cluster contains $m$ identical processors, with $m \geq 1$ a fixed constant
($\bar{P}, m \mid prec; (c_{ij}, \epsilon_{ij}) = (1, 0); p_i = 1 \mid C_{max}$).

In order to treat this problem we consider a generalization of the integer linear
program presented in Section 2. We have to extend the two cases of Figure 1 by
considering the cases given in Figure 5.





Fig. 5. Two extended cases. 
- For the case (a):

For every $i$ even, with $0 \leq i \leq 2m - 2$, such that $(i + 1, i) \in E$, and $i$ odd
with $1 \leq i \leq 2m - 1$ such that $(i, i + 1) \in E$,

$$\sum_{\substack{0 \leq i \leq 2m-2 \\ i \text{ even}}} x_{(i+1)i} + \sum_{\substack{1 \leq i \leq 2m-1 \\ i \text{ odd}}} x_{i(i+1)} \geq 1.$$

- For the case (b):

For every $i$ even, with $0 \leq i \leq 2m - 2$, such that $(i, i + 1) \in E$, and $i$ odd
with $1 \leq i \leq 2m - 1$ such that $(i + 1, i) \in E$,

$$\sum_{\substack{0 \leq i \leq 2m-2 \\ i \text{ even}}} x_{i(i+1)} + \sum_{\substack{1 \leq i \leq 2m-1 \\ i \text{ odd}}} x_{(i+1)i} \geq 1.$$

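Thus, in what follows, we consider the following integer linear programming
(ILP) problem:

$$
\left\{
\begin{array}{ll}
\min C_{max} & \\
\forall (i, j) \in E, & x_{ij} \in \{0, 1\} \\
\forall i \in V, & t_i \geq 0 \\
\forall (i, j) \in E, & t_i + 1 + x_{ij} \leq t_j \\
\forall i \in V - U, & \sum_{j \in \Gamma^+(i)} x_{ij} \geq |\Gamma^+(i)| - m \\
\forall i \in V - Z, & \sum_{j \in \Gamma^-(i)} x_{ji} \geq |\Gamma^-(i)| - m \\
\text{for the cases (a)}, & \sum_{\substack{0 \leq i \leq 2m-2 \\ i \text{ even}}} x_{(i+1)i} + \sum_{\substack{1 \leq i \leq 2m-1 \\ i \text{ odd}}} x_{i(i+1)} \geq 1 \\
\text{for the cases (b)}, & \sum_{\substack{0 \leq i \leq 2m-2 \\ i \text{ even}}} x_{i(i+1)} + \sum_{\substack{1 \leq i \leq 2m-1 \\ i \text{ odd}}} x_{(i+1)i} \geq 1 \\
\forall i \in V, & t_i + 1 \leq C_{max}
\end{array}
\right.
$$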

We use the same heuristic with two steps except that for Step 1 the
rounding is: if $e_{ij} < \frac{1}{2m}$ then $x_{ij} = 0$, otherwise $x_{ij} = 1$.

Notice that, in the case of $m = 1$, Step 2 of the algorithm in Section 3 is
useless.

Lemma 8. For every job $i \in V$, $t_i^h \leq \left(2 - \frac{2}{2m+1}\right) t_i^*$.

The proof of Lemma 8 is similar to the proof of Lemma 3 by replacing $\frac{8}{5}$ by
$2 - \frac{2}{2m+1}$, 0.25 by $\frac{1}{2m}$, and 1.25 by $1 + \frac{1}{2m}$.

Finally, we obtain the following result:
Theorem 3. The relative performance of our heuristic is bounded above by
$2 - \frac{2}{2m+1}$ and this bound is tight.

Thus, in the case of $m = 1$ we get the relative performance for the
$\bar{P} \mid prec; c_{ij} = 1; p_i = 1 \mid C_{max}$ problem given by Munier and König |B|.
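
As a quick sanity check on the general bound, the Python sketch below evaluates
$2 - \frac{2}{2m+1}$ for small $m$ with exact rationals. The rounding threshold $\frac{1}{2m}$ used
here is an assumption inferred from the value 0.25 used for $m = 2$ in Section 3,
and the function names are invented for the example.

```python
from fractions import Fraction


def rounding_threshold(m):
    """Assumed Step 1 threshold for clusters of m processors (0.25 when m = 2)."""
    return Fraction(1, 2 * m)


def performance_bound(m):
    """Relative performance bound 2 - 2/(2m+1) stated in Theorem 3."""
    return 2 - Fraction(2, 2 * m + 1)


for m in (1, 2, 3, 4):
    print(m, rounding_threshold(m), performance_bound(m))
```

Under this assumption the bound $\rho$ and the threshold $\theta$ satisfy $\rho(1 + \theta) = 2$, which is
exactly what the second case of the induction needs: $\rho(t_i^* - 1 - \theta) + 2 \leq \rho\, t_i^*$.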

6 Conclusions 

In this paper, we gave an approximation algorithm for
$P, 2 \mid prec; (c_{ij}, \epsilon_{ij}) = (1, 0); p_i = 1 \mid C_{\max}$, with relative performance
equal to $\frac{8}{5}$. Recall that there is no hope of finding a heuristic with relative
performance guarantee less than $\frac{5}{4}$ (unless $P = NP$). Our approach also extends to
the problem $P, m \mid prec; (c_{ij}, \epsilon_{ij}) = (1, 0); p_i = 1 \mid C_{\max}$, which is more
interesting from a practical point of view, i.e. to the case where the number of
processors in each cluster is any fixed constant. In that case, the performance
ratio is a function of the number of processors.

It would be interesting to extend the heuristic presented here to the case
where the number of clusters is limited, or to develop other heuristics improving
the obtained bound. Another extension of this work will concern the well-known
small communication case.
Abstract. We investigate the multiprocessor multi-stage open shop and
flow shop scheduling problems. In both problems, there are $s$ stages, each
consisting of a number $m_i$ of parallel identical machines for $1 \le i \le s$.
Each job consists of $s$ operations, with one operation for each stage. The
goal is to find a non-preemptive schedule that minimizes the makespan.
We propose polynomial time approximation schemes for the multiprocessor
open shop and flow shop scheduling problems when the number of
stages $s$ is constant and the numbers of machines $m_i$ are non-constant.

1 Introduction 

Problem Definition. A flow shop (or open shop) is a multi-stage production
process with the property that all jobs have to pass through the stages. For flow
shops the order in which the jobs pass through the stages is the same, whereas
for open shops the order is immaterial. There are $n$ jobs $J_j$, with $j = 1, \ldots, n$,
where each job $J_j$ consists of $s$ operations $O_{1j}, \ldots, O_{sj}$. The operation $O_{ij}$, with
$i = 1, \ldots, s$, has to be processed at stage $i$ of the production process, and $p_{ij}$ is
the processing time or length of operation $O_{ij}$.

In the classical open and flow shop problem, there is only one machine
available for each stage. In the multiprocessor open and flow shop problem, for every
stage $i$ there are $m_i$ identical machines available that can process operations
in parallel. Since more than $n$ machines on a stage are not necessary, we may
assume that $m_i \le n$. At any time step, every job is processed by at most one
machine and every machine executes at most one job. We assume that preemption
is not allowed, i.e. once an operation is started, it must be completed without
interruption. The goal is to find a schedule that minimizes the makespan $C_{\max}$,

* This work was done while the first author was associated with the research institute
IDSIA Lugano and supported in part by the Swiss National Science Foundation
project 21-55778.98, "Resource Allocation and Scheduling in Flexible Manufacturing
Systems".
that is the maximum completion time among all jobs. The minimum makespan
among all schedules is denoted by $C^{*}_{\max}$.

Following the three-field notation scheme, the makespan minimization
problem in a classical open and flow shop with $s$ stages is denoted by $Os \| C_{\max}$
and $Fs \| C_{\max}$ (or $O \| C_{\max}$ and $F \| C_{\max}$, depending on whether the number
$s$ of stages is constant or not), respectively. The makespan minimization in
a multiprocessor $s$-stage open and flow shop is denoted by $Os(P) \| C_{\max}$ and
$Fs(P) \| C_{\max}$ (or $O(P) \| C_{\max}$ and $F(P) \| C_{\max}$), respectively.

Complexity Results. Gonzalez and Sahni proved that $Os \| C_{\max}$ is NP-hard
in the weak sense, and Williamson et al. showed that $O \| C_{\max}$ is NP-hard
in the strong sense. But it is not known whether the problem $Os \| C_{\max}$
allows a pseudopolynomial time algorithm or whether it is
NP-hard in the strong sense. On the other hand, Garey et al. showed that
$F3 \| C_{\max}$ is strongly NP-hard, and Hoogeveen et al. proved that
$F2(P2, P1) \| C_{\max}$ (with two stages, two machines on the first stage and one
machine on the second stage) and $F2(P1, P2) \| C_{\max}$ are already strongly
NP-hard. If we have only one stage, $s = 1$, then we have the classical strongly NP-hard
scheduling problem $P \| C_{\max}$ of independent jobs on identical machines.

A polynomial time approximation scheme (PTAS) for a (minimization)
optimization problem P is an algorithm that, given any constant value $\epsilon > 0$, finds
in polynomial time a solution of value no larger than $1 + \epsilon$ times the value of
an optimum solution. A fully polynomial time approximation scheme is an
approximation scheme that runs in time polynomial in the size of the input and
$1/\epsilon$.

Approximability Results. If the number $s$ of stages is part of the input,
Williamson et al. proved that the existence of an approximation algorithm
with worst case ratio $< 5/4$ for the problem $O \| C_{\max}$ or $F \| C_{\max}$ would imply
P = NP. On the positive side, Hall and Sevastianov and Woeginger [12]
have proposed polynomial time approximation schemes (PTAS) for $Fs \| C_{\max}$
and $Os \| C_{\max}$, respectively. Both PTASs can be generalized to the case where
the number of stages and the number of machines per stage are all constant.
Furthermore, Hochbaum and Shmoys have given a polynomial time approximation
scheme for the problem with one stage, which is equivalent to $P \| C_{\max}$.

Chen and Strusevich have developed an approximation algorithm for
$O(P) \| C_{\max}$ (the multiprocessor open shop problem) with worst case ratio $2 + \epsilon$.
For $O2(P) \| C_{\max}$, they have derived a worst case ratio that depends on
$m = \max(m_1, m_2)$. Schuurman and Woeginger [11] have found an
approximation algorithm for $O(P) \| C_{\max}$ with improved worst case ratio 2.
Furthermore, a $(3/2 + \epsilon)$-approximation algorithm for the problem $O2(P) \| C_{\max}$
(with two stages) is given in [11]. The existence of an approximation scheme
for $Os(P) \| C_{\max}$ with a constant number $s \ge 2$ of stages and an arbitrary number of
machines per stage was posed as an open problem by Schuurman and Woeginger [11].

Several approximation algorithms have been studied for the two-stage
multiprocessor flow shop problem. The best result is a polynomial
time approximation scheme for $F2(P) \| C_{\max}$. Determining the
approximability behaviour of $Fs(P) \| C_{\max}$ with a constant number $s \ge 3$ of stages
and an arbitrary number of machines per stage was posed as an open question in a
paper by Hall.

New Results. In this paper, we propose polynomial time approximation
schemes for both the multiprocessor open shop and flow shop scheduling problems
when the number of stages $s$ is constant and the numbers of machines $m_i$ on
stage $1 \le i \le s$ are part of the input. For open shops, this improves, even for
two stages, the best previously known result of $3/2 + \epsilon$ in [11]. Furthermore, we
answer the open question by Schuurman and Woeginger [11] for $Os(P) \| C_{\max}$
and the open question by Hall for $Fs(P) \| C_{\max}$. Notice that we cannot
expect a fully polynomial time approximation scheme unless P = NP, since both
problems are strongly NP-hard even for a constant number of stages.

Interestingly, we do not use linear programming. In our approach, we use
dynamic programming combined with several ideas from Hall, Hochbaum and
Shmoys, Schuurman and Woeginger, and Sevastianov and Woeginger.

2 Restricted Problem 

Let $L_i = \frac{1}{m_i} \sum_{j=1}^{n} p_{ij}$ be the average load on stage $i$, and let
$p_i^{\max} = \max_{1 \le j \le n} p_{ij}$ be the maximum processing time of any operation on stage $i$.
For the multiprocessor flow and open shop problem, we can derive the following
bounds for the minimum makespan $C^{*}_{\max}$:

$$\max_{1 \le i \le s} \max\{L_i, p_i^{\max}\} \le C^{*}_{\max} \le 2s \max_{1 \le i \le s} \max\{L_i, p_i^{\max}\}.$$

The lower bound should be clear. The ideas for the upper bound are that we
can use list scheduling to obtain a schedule with makespan at most
$L_i + (1 - \frac{1}{m_i}) p_i^{\max} \le L_i + p_i^{\max}$ for the operations on stage $i$, and that we can
schedule all stages one after another. By dividing all processing times by
$2s \max_{1 \le i \le s} \max\{L_i, p_i^{\max}\}$ we get an instance $I$ such that
$\frac{1}{2s} \le C^{*}_{\max} \le 1$. In the rest
of the paper we assume without loss of generality that the minimum makespan
$C^{*}_{\max}$ satisfies the inequalities $\frac{1}{2s} \le C^{*}_{\max} \le 1$.
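The normalization step can be sketched in a few lines. This is a minimal illustration, assuming the processing times are given as a list of rows `p[i][j]` (stage `i`, job `j`) and the machine counts as a list `machines[i]`; these names and the data layout are hypothetical and do not come from the paper.

```python
from fractions import Fraction


def stage_bounds(p, machines):
    """Return (average loads L_i, longest operations p_i^max, lower bound)."""
    loads = [sum(Fraction(x) for x in row) / m for row, m in zip(p, machines)]
    longest = [max(Fraction(x) for x in row) for row in p]
    lower = max(max(load, top) for load, top in zip(loads, longest))
    return loads, longest, lower


def normalize(p, machines):
    """Divide all processing times by 2s * max_i max{L_i, p_i^max}."""
    s = len(p)
    _, _, lower = stage_bounds(p, machines)
    scale = 2 * s * lower
    return [[Fraction(x) / scale for x in row] for row in p]
```

After the call to `normalize`, the lower bound of the scaled instance equals $\frac{1}{2s}$ and the upper bound equals 1, which is exactly the range assumed for $C^{*}_{\max}$ in the rest of the section.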

For real numbers $\kappa, \delta > 0$ we define three sets of operations:

$B = \{O_{ij} \mid p_{ij} \ge \kappa\}$,

$M = \{O_{ij} \mid \delta\kappa \le p_{ij} < \kappa\}$,

$S = \{O_{ij} \mid p_{ij} < \delta\kappa\}$.

We will assume that $\delta < 1$ and that $1/\delta^{1/2}$ is an integer; we will choose $\delta$ later
in Section 4. The operations in $B$ are called big operations, the operations in $M$
are called medium operations and operations in $S$ are called small operations.
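The three classes can be computed directly from the thresholds $\kappa$ and $\delta\kappa$. The sketch below assumes that boundary values belong to the larger class (an operation with $p_{ij} = \kappa$ is big, one with $p_{ij} = \delta\kappa$ is medium); the source formulas are damaged at exactly these inequality signs, so this convention is an assumption. The function name and the `p[i][j]` layout are hypothetical.

```python
from fractions import Fraction


def classify(p, kappa, delta):
    """Split operations (i, j) into big, medium and small index lists."""
    big, medium, small = [], [], []
    for i, row in enumerate(p):
        for j, pij in enumerate(row):
            if pij >= kappa:
                big.append((i, j))
            elif pij >= delta * kappa:
                medium.append((i, j))
            else:
                small.append((i, j))
    return big, medium, small
```

With $\kappa = 1/2$ and $\delta = 1/4$, for example, the thresholds are $1/2$ and $1/8$, so an operation of length $1/4$ is medium and one of length $1/10$ is small.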
In the following, we consider an optimum schedule $S$ of length $C^{*}_{\max} \le 1$. We
denote with $\tau_{ij}$ the starting time of operation $O_{ij}$ in $S$. We partition the schedule
$S$ into intervals of length $\delta\kappa$ and increase each cut (between two intervals) by
an interval of length $2\delta^{3/2}\kappa$. Using such enlarged cuts, the processing times of
the big and medium operations are increased. We use the following convention:
if we cut through the endpoint of an operation or through its middle, then we
enlarge the processing time of the operation by $2\delta^{3/2}\kappa$. If we cut through the
startpoint of an operation, then we delay the starting time by $2\delta^{3/2}\kappa$. Let $\bar{S}$ be
the enlarged schedule, and let $\bar{\tau}_{ij}$ be the starting time of $O_{ij}$ in $\bar{S}$.

Since the number of cuts is at most $\lceil \frac{1}{\delta\kappa} \rceil - 1 \le \frac{1}{\delta\kappa}$ and each cut is replaced
by an interval of length $2\delta^{3/2}\kappa$, the total length of the enlarged schedule $\bar{S}$ can
be bounded by $C^{*}_{\max} + 2\delta^{1/2}$.
If $\delta$ is small enough (e.g. $2\delta^{1/2} \le \frac{\epsilon}{4s}$, or equivalently $\delta \le (\frac{\epsilon}{8s})^{2}$), then we get
an additive factor of at most $\frac{\epsilon}{4s} \le \frac{\epsilon}{2} C^{*}_{\max}$.

Then, each medium operation $O_{ij}$ is cut at least once (i.e. $O_{ij}$ lies in more
than one interval of length $\delta\kappa$). This implies that the processing time of each
medium operation $O_{ij}$ is increased by at least $2\delta^{3/2}\kappa$. In other words,
there is now a time window of size $\ge p_{ij} + 2\delta^{3/2}\kappa$ in which we can shift operation $O_{ij}$
(with length $p_{ij}$) without generating conflicts between $O_{ij}$ and other operations
$O_{i'j}$, $i \ne i'$. The time window of $O_{ij}$ has the form $[\bar{\tau}_{ij}, \bar{\tau}_{ij} + p_{ij} + 2x_{ij}\delta^{3/2}\kappa]$
with $x_{ij} \in \mathbb{N}$ and $x_{ij} \ge 1$. We can shift each medium operation $O_{ij}$ within its
time window such that it starts at a time $\tau'_{ij}$ which is a multiple
of $\delta^{3/2}\kappa$. Furthermore, we can increase the processing time $p_{ij}$ of $O_{ij}$ to $p'_{ij}$ such
that $p'_{ij} \ge p_{ij}$ is a multiple of $\delta^{3/2}\kappa$.

For a big operation $O_{ij} \in B$, the situation is similar. Since $p_{ij} \ge \kappa$, $O_{ij}$ is cut
at least $\frac{1}{\delta}$ times and the corresponding time window is increased by
at least $\frac{1}{\delta} 2\delta^{3/2}\kappa = 2\delta^{1/2}\kappa$. Therefore, we can assume that a big operation starts
processing at a time $\tau'_{ij}$ which is a multiple of $\delta^{1/2}\kappa$ and has processing length
$p'_{ij} \ge p_{ij}$ which is also a multiple of $\delta^{1/2}\kappa$.

We also round the processing times of small operations up to the nearest
multiples of a fixed unit. Since any critical path contains at most $ns$ operations,
the length of the schedule will not increase considerably (we add at most
$2\delta^{1/2}$ to the length of the schedule). In the following, we will consider only
restricted schedules with the properties:

– each medium operation $O_{ij} \in M$ starts at a time which is a multiple of
$\delta^{3/2}\kappa$,

– each medium operation $O_{ij} \in M$ has processing time $p'_{ij} = \delta^{3/2}\kappa \ell_{ij}$ with
$\ell_{ij} \in \mathbb{N}$,

– each big operation $O_{ij} \in B$ starts at a time which is a multiple of $\delta^{1/2}\kappa$,

– each big operation $O_{ij} \in B$ has processing time $p'_{ij} = \delta^{1/2}\kappa \ell_{ij}$ with $\ell_{ij} \in \mathbb{N}$,

– each small operation $O_{ij} \in S$ has a processing time $p'_{ij}$ that is an integer multiple $\ell_{ij} \in \mathbb{N}$ of the rounding unit for small operations.
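The rounding behind these properties is a round-up to a grid. The sketch below is an illustration only: it assumes grid units $\delta^{3/2}\kappa$ for medium and $\delta^{1/2}\kappa$ for big operations, and it simply delays a start time to the next grid point, which is a simplification of the shift inside the time window described in the text. All names are hypothetical, and $\delta^{1/2}$ is passed as a rational `root_delta` to keep the arithmetic exact.

```python
from fractions import Fraction
from math import ceil


def round_up_to_grid(value, unit):
    """Smallest multiple of `unit` that is >= `value` (exact arithmetic)."""
    return ceil(Fraction(value) / unit) * unit


def restricted_times(start, length, unit):
    """Move the start and stretch the length so both are multiples of `unit`."""
    return round_up_to_grid(start, unit), round_up_to_grid(length, unit)


def grid_units(root_delta, kappa):
    """Return (medium unit, big unit) = (delta^(3/2) kappa, delta^(1/2) kappa)."""
    return root_delta ** 3 * kappa, root_delta * kappa
```

For $\delta = 1/4$ and $\kappa = 1$ the units are $1/8$ and $1/2$; a medium operation of length $3/10$ is stretched to $3/8$, an increase of less than one grid unit.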
The processing times satisfy the conditions $p'_{ij} \ge p_{ij}$ for $O_{ij} \in B \cup M \cup S$,
$p'_{ij} - p_{ij} \le \delta^{3/2}\kappa$ for $O_{ij} \in M$ and $p'_{ij} - p_{ij} \le \delta^{1/2}\kappa$ for $O_{ij} \in B$. Using the fact
$C^{*}_{\max} \in [\frac{1}{2s}, 1]$ and the choice of $\delta$ above, the optimum restricted schedule has length at
most $(1 + \frac{\epsilon}{2}) C^{*}_{\max} \le 2$ (for $\epsilon \le 2$). In the next section we present
an algorithm to compute restricted schedules based on dynamic programming.



3 Computation of Restricted Schedules 

3.1 Starting Intervals of Operations 

We have two types of intervals in our restricted schedule (of total length $\le 2$):

– intervals of the first type with length $\delta^{1/2}\kappa$,

– intervals of the second type with length $\delta^{3/2}\kappa$.

Since $\delta < 1$, the intervals of the second type are smaller. In fact, an interval
of the first type consists of $1/\delta$ many intervals of the second type. We assume
that $1/\delta^{1/2}$ and $\mu = \frac{2}{\delta^{1/2}\kappa}$ are integer numbers (see also Section 4). The number $\mu$
gives an upper bound on the number of intervals of the first type. The number
of intervals of the second type is bounded by $\frac{2}{\delta^{3/2}\kappa} = \frac{\mu}{\delta}$. The goal is to find a
restricted schedule with a minimum number of used intervals of the first type.

For each job $J_j$, we use vectors $K_j = (\tau_1, \ldots, \tau_s)$ of intervals of the second
type for the operations, with $\tau_i \in \{0, \ldots, \frac{\mu}{\delta} - 1\}$ for $1 \le i \le s$ (to indicate the
starting interval). If $O_{ij} \in B \cup S$, then we require that $\tau_i$ is even a multiple
of $\frac{1}{\delta}$. These assumptions for big operations arise directly from the restricted
schedules. A big operation $O_{ij} \in B$ starts only at the beginning of an interval
of the first type. A medium operation $O_{ij} \in M$ starts at the beginning of an
interval of the second type. Therefore, we can use integers for the values $\tau_i$. For
a small operation $O_{ij} \in S$ the condition '$\tau_i$ is a multiple of $\frac{1}{\delta}$' means that we fix
an interval of the first type where $O_{ij}$ has to be started. The number of different
vectors $K_j$ is bounded by a constant $O((\frac{\mu}{\delta})^{s})$. A vector $K_j = (\tau_1, \ldots, \tau_s)$ is
feasible in the flow shop environment if and only if:

(1) $\tau_1 \le \tau_2 \le \ldots \le \tau_s$,

(2) the processing intervals for operations $O_{ij} \in B$ and $O_{i'j} \in B \cup M \cup S$ (or
$O_{ij} \in M$ and $O_{i'j} \in M$), $i \ne i'$, do not intersect,

(3) if $O_{ij} \in M$, $O_{i'j} \in S$ and $i < i'$, then $O_{i'j}$ has to start in the interval (of
the first type) where $O_{ij}$ ends or afterwards,

(4) if $O_{ij} \in S$ and $O_{i'j} \in M$ (and $i < i'$), then $O_{ij}$ must start in the interval
(of the first type) where $O_{i'j}$ begins or before, i.e. $\tau_i \le \tau_{i'}$.

In general, these properties do not give directly a feasible schedule: two small 
operations (or a small and medium operation) may be executed at the same time 
or in the wrong order. On the other hand, each job in a feasible restricted schedule 
satisfies these conditions. We show later how we generate a feasible approximate 
schedule from such a solution. Similar conditions about feasible combinations 
can be given also for open and job shops. 
3.2 Load and Configuration

The load $L_{g,i}$ is the total sum of processing times $p'_{ij}$ of small operations $O_{ij}$
assigned to start in interval $g$ (of the first type) on stage $i$. Since in an optimum
restricted schedule a small operation may be executed in two consecutive
intervals $g$ and $g + 1$, the maximum allowed load $L_{g,i}$ is at most $m_i(\delta^{1/2}\kappa + \delta\kappa)$ for
$0 \le g \le \mu - 1$. Using $m_i \le n$ and $\delta < 1$, the values $L_{g,i}$ are bounded above by
$2n\delta^{1/2}\kappa$. We have added the value $\delta\kappa$ for the last small operation on each of the
$m_i$ machines. Later, we will observe that some of these $m_i$ machines cannot be
used for a small operation. For instance, a big operation can cover several
intervals of the first type. Therefore, we can place a small operation $O_{ij} \in S$ only onto
a machine (on stage $i$) with at least one idle period. Since the processing times of
small operations are multiples of a fixed unit, we have only discrete and at most $O(n^{2})$
different load values for an interval (here we use the bounds $L_{g,i} \le 2n\delta^{1/2}\kappa$).
We store in our computation a load vector
$L = (L_{0,1}, \ldots, L_{0,s}, \ldots, L_{\mu-1,1}, \ldots, L_{\mu-1,s})$. The
maximum number of different load vectors is $O(n^{2s\mu})$.

The key observation is that we can have only a constant number of big
and medium operations on each machine. The maximum number is at most
$\frac{2}{\delta\kappa}$, since the schedule length is $\le 2$ and the processing times satisfy
$p'_{ij} \ge p_{ij} \ge \delta\kappa$ for $O_{ij} \in B \cup M$.
Furthermore, there is only a constant number $\le \frac{2}{\delta^{3/2}\kappa} = \frac{\mu}{\delta}$ of starting times
for big and medium operations (or intervals of the second type) in a restricted
schedule. A schedule type is described by a set of intervals of the second type
(with length $\delta^{3/2}\kappa$) where big and medium operations must be processed. We
have only a constant number $T \le 2^{\mu/\delta}$ of different schedule types. Let $S_1, \ldots, S_T$
be the different schedule types in a restricted schedule. The first schedule type
$S_1 = \emptyset$ corresponds to a free machine (without any assigned big or medium
operation). For each stage $i$, $1 \le i \le s$, in the dynamic program we compute a
vector $a^{(i)} = (a_1^{(i)}, \ldots, a_T^{(i)})$ where $a_k^{(i)} \ge 0$ gives the number of machines with
schedule type $S_k$ on stage $i$. Notice that $\sum_{k=1}^{T} a_k^{(i)} = m_i$.

A configuration given by $s$ vectors $a^{(1)}, \ldots, a^{(s)}$ describes completely the set
of intervals on each stage which are occupied by big and medium operations and
the set of intervals which may be filled by small operations. Notice also that a
configuration does not give an assignment for the big and medium operations.
We only fix intervals for processing big and small operations. The total number
of different configurations is bounded by $(\max_{1 \le i \le s} m_i + 1)^{sT}$.



3.3 Dynamic Programming 

In this section we present a dynamic programming algorithm for computing
restricted schedules. Recall that a restricted schedule may not be feasible for the
original shop problem. In Section 4 we give an algorithm for converting a
restricted schedule into a feasible one.

Our algorithm consists of two phases. In the first phase we compute a table
with elements $(a^{(1)}, \ldots, a^{(s)}, L, k)$. Each element contains an answer to the
question: 'Is there an assignment of starting times $K_1, \ldots, K_k$ to jobs $J_1, \ldots, J_k$
and an assignment of machines to the operations of these jobs such that big and
medium operations are processed according to the configuration $a^{(1)}, \ldots, a^{(s)}$
and the total length of the small operations assigned to interval $g$ (of the first
type) on stage $i$ is $L_{g,i}$?' We will show in the next section that we can obtain
a near optimal solution using such an assignment. Notice also that if there is
a restricted schedule of jobs $J_1, \ldots, J_k$ with configuration $a^{(1)}, \ldots, a^{(s)}$ and load
vector $L$, then the corresponding element $(a^{(1)}, \ldots, a^{(s)}, L, k)$ must contain the
answer 'Yes'. In general, the converse statement does not hold. The number of
elements in the table is bounded above by a polynomial in $n$. Notice also that (in
the first phase) we do not compute the restricted schedules explicitly.

In the second phase we compute an element $(a^{(1)}, \ldots, a^{(s)}, L, n)$ which
contains 'Yes' with feasible load vector $L$ (see below) and minimum makespan, i.e.
with a minimum number of intervals of the first type used in the configuration
and load vector. Let $l_{g,i}$ be the total load in interval $g$ of the first type on stage
$i$ reserved for big and medium operations (we can simply compute $l_{g,i}$ from the
vector of schedule types $a^{(i)}$). Furthermore, let $m_{g,i} \le m_i$ be the number of
machines with at least one idle time between operations from $B \cup M$ in interval
$g$ on stage $i$. Then a load vector is feasible if
$L_{g,i} \le m_i\delta^{1/2}\kappa + m_{g,i}\delta\kappa - l_{g,i}$ for
all $g$ and $i$ (we have added $m_{g,i}\delta\kappa$ since some small operations may start at the
very end of an interval). Since the number of elements in the table is polynomial,
we can find a feasible element with smallest makespan (number of intervals) in
polynomial time.

After that we can simply compute an assignment of starting times to jobs
and an assignment of machines to operations using an iterative procedure. Given
an assignment of starting times $K_1, \ldots, K_n$ and vectors $T_1, \ldots, T_n$ (of indices of
schedule types) we can compute a restricted schedule (including an assignment
of machines to the big and medium operations) with almost the same makespan.
After that we convert the obtained restricted schedule into a feasible schedule
of the original shop problem (see the next section).

4 Transformation into a Feasible Schedule

4.1 Multiprocessor Flow Shop 

Let us consider a machine $M$ on stage $i$ and an interval $g$ (of the first type).
The first step is to shift all medium operations (which start in this interval)
consecutively to the left side of interval $g$ on $M$ (only the last operation will not
be shifted if it is executed also in interval $g + 1$). Using such a transformation we
may generate new infeasibilities between operations of the same job (on different
stages). But using this transformation, for every machine $M$ we have at most
one time interval in interval $g$ where $M$ is idle.

In the second step, we place all small operations into these gaps. Remember
that the load $L_{g,i}$ plus the (partial) processing times of medium operations
assigned to interval $g$ (on machines of stage $i$ with at least one idle period)
is at most $m_{g,i}(\delta^{1/2}\kappa + \delta\kappa)$. If we increase the length of the interval by $2\delta\kappa$,
then we can place all small operations completely and without interruptions
by a greedy algorithm into the $m_{g,i}$ idle periods of the enlarged interval. The
generated schedule is a restricted schedule which is in general not feasible for
the original shop problem.

Since the number of intervals of the first type is at most $\mu = 2/(\delta^{1/2}\kappa)$, the
total length of the schedule is enlarged by at most $\mu \cdot 2\delta\kappa = 4\delta^{1/2}$. The last step is
now to delay all machines on stage $1 \le i \le s$ by a time of $(\delta^{1/2}\kappa + 2\delta\kappa)(i - 1)$. This
idea, called sliding, was also used by Hall for classical flow shops. Since there are
infeasibilities only between operations of the same job which are processed in the
same interval of the first type, (using this sliding) we get a feasible schedule for all
jobs and the length of the schedule is increased by at most
$(\delta^{1/2}\kappa + 2\delta\kappa)(s - 1) \le 3(s - 1)\delta^{1/2}\kappa$. Notice that the computed restricted schedule has only a minimum
number of used intervals of the first type and that the load of the small jobs
(in the last interval) could also increase the makespan in comparison to the
optimum restricted schedule. Therefore, we have to take into account the length
$\delta^{1/2}\kappa + 2\delta\kappa \le 3\delta^{1/2}\kappa$ of the last interval. Finally, we also have to add the additional length
caused by the consideration of only restricted schedules.
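The sliding step is a uniform delay per stage. The sketch below assumes the delay $(\delta^{1/2}\kappa + 2\delta\kappa)(i - 1)$ for stage $i$ as read from the text; the representation of a schedule as one list of start times per stage and the function name are hypothetical, and $\delta^{1/2}$ is passed as a rational `root_delta` to keep the arithmetic exact.

```python
from fractions import Fraction


def slide(start_times, root_delta, kappa):
    """Delay every start time on stage i (1-based) by step * (i - 1).

    start_times[i - 1] holds the start times of the operations on stage i.
    """
    delta = root_delta ** 2
    step = root_delta * kappa + 2 * delta * kappa
    # enumerate() is 0-based, so its index already equals (i - 1).
    return [[t + step * index for t in stage]
            for index, stage in enumerate(start_times)]
```

The first stage is left untouched and the last stage is delayed by `step * (s - 1)`, which is the increase in schedule length accounted for in the text.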

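It remains to choose $\delta$ and $\kappa$ and to bound the additive factor $3s\delta^{1/2}\kappa + 4\delta^{1/2}$.
Using $\kappa = 1$ (so we do not use big operations for the multiprocessor flow shop
problem), we get the additive factor $(3s + 4)\delta^{1/2}$. Using $\delta \le \left(\frac{\epsilon}{2s(3s+4)}\right)^{2}$, we get
$(3s + 4)\delta^{1/2} \le \frac{\epsilon}{2s} \le \epsilon C^{*}_{\max}$. Therefore, we define

$$\delta = \left(\frac{1}{\lceil 2s(3s+4)/\epsilon \rceil}\right)^{2} \le \left(\frac{\epsilon}{2s(3s+4)}\right)^{2}$$

and have the property that $1/\delta^{1/2}$ is integral. Since $\kappa = 1$, the number $\mu = 2/(\delta^{1/2}\kappa)$
is also integral.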



4.2 Multiprocessor Open Shop 

The first step for this problem is to find a value for $\kappa$ such that the average loads
of medium operations on each stage are bounded by a small constant $\alpha$.
We define a sequence of blocks as follows:

$B_0 = \{O_{ij} \mid p_{ij} \ge \delta\}$,

$B_1 = \{O_{ij} \mid \delta^{2} \le p_{ij} < \delta\}$,

$B_t = \{O_{ij} \mid \delta^{t+1} \le p_{ij} < \delta^{t}\}$,

with $t \ge 1$, where $\delta$ is a value that depends on $\epsilon$ and $s$. We will choose $\delta$ later. The
average load $L_i(B_t)$ of operations in $B_t$ on stage $i$ is given by
$\frac{1}{m_i} \sum_{O_{ij} \in B_t} p_{ij}$. Since

$$\sum_{i=1}^{s} \sum_{t \ge 1} L_i(B_t) \le \sum_{i=1}^{s} L_i \le s,$$

there exists a constant $k$ with $1 \le k \le \lceil s/\alpha \rceil$ such that the total average load
$\sum_{i=1}^{s} L_i(B_k) \le \alpha$. By contradiction, suppose that $\sum_{i=1}^{s} L_i(B_t) > \alpha$ for every
$t = 1, \ldots, \lceil s/\alpha \rceil$. Then we obtain

$$\sum_{t=1}^{\lceil s/\alpha \rceil} \sum_{i=1}^{s} L_i(B_t) > \lceil s/\alpha \rceil \, \alpha \ge s,$$

a contradiction.

Using this idea (a generalization of an idea from earlier work) and $\alpha$ chosen small
enough, we get a block $B_k$, $k \ge 1$, with average load on each stage bounded by
$\alpha$. We define $M = B_k$, $S = \bigcup_{\ell \ge k+1} B_{\ell}$, $B = \bigcup_{\ell < k} B_{\ell}$ and $\kappa = \delta^{k}$.
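The pigeonhole argument translates into a direct search for the block index $k$. The sketch below assumes blocks of the form $B_t = \{O_{ij} \mid \delta^{t+1} \le p_{ij} < \delta^{t}\}$ and a search range $1 \le k \le \lceil s/\alpha \rceil$; both are reconstructed from damaged formulas and should be read as assumptions. Function names and the `p[i][j]` layout are hypothetical, and all processing times are assumed positive.

```python
from fractions import Fraction
from math import ceil


def block_index(pij, delta):
    """Return t with delta**(t+1) <= pij < delta**t (t = 0 if pij >= delta)."""
    t = 0
    bound = delta  # equals delta**(t + 1) throughout the loop
    while pij < bound:
        t += 1
        bound *= delta
    return t


def choose_block(p, machines, delta, alpha):
    """Smallest k in 1..ceil(s/alpha) whose total average load is <= alpha."""
    s = len(p)
    limit = ceil(Fraction(s) / alpha)
    totals = {t: Fraction(0) for t in range(1, limit + 1)}
    for row, m in zip(p, machines):
        for pij in row:
            t = block_index(pij, delta)
            if t in totals:
                totals[t] += Fraction(pij) / m
    for t in range(1, limit + 1):
        if totals[t] <= alpha:
            return t
    raise ValueError("no block found; every stage load must be at most 1")
```

Once `k` is known, the threshold is $\kappa = \delta^{k}$, so block $k$ is exactly the set of medium operations $\{O_{ij} \mid \delta\kappa \le p_{ij} < \kappa\}$.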

In the multiprocessor open shop problem, all medium operations can be
shifted to the end of the schedule. Then, we can apply the algorithm in [11] on the
medium operations. It generates a so-called dense schedule, and the makespan
of the generated schedule can be bounded by

$$s \, p_{\max} + \max_{1 \le i \le s} L_i(B_k)$$

where

$$p_{\max} = \max_{1 \le i \le s,\ O_{ij} \in M} p_{ij} \le \kappa = \delta^{k}.$$

Therefore, the makespan for the medium operations can be bounded by
$s\delta^{k} + \alpha \le s\delta + \alpha$ (here we have used $k \ge 1$ and $\delta < 1$).

Using this shifting of the medium operations, we have to place only the small
operations into the intervals of the first type. The maximum (rounded) load of
the small operations for interval $g$ and stage $i$ is $\le m_{g,i}(\delta^{1/2}\kappa + \delta\kappa)$, where
$m_{g,i} \le m_i$ is the number of machines with at least one idle period. We increase
the length of each interval to $\delta^{1/2}\kappa + (s + 1)\delta\kappa$. As a consequence, for each interval
$g$ we have generated a smaller instance of the multiprocessor open shop problem
with

– $m_{g,i}$ machines on stage $i$,

– maximum processing time $p_{\max}$ of an operation $\le \delta\kappa$,

– total load of small operations (on stage $i$) $\le m_{g,i}(\delta^{1/2}\kappa + \delta\kappa)$.

Using the algorithm in [11] we can generate a schedule for such an instance of
length at most

$$(s\delta\kappa) + (\delta^{1/2}\kappa + \delta\kappa) \le \delta^{1/2}\kappa + (s + 1)\delta\kappa.$$

Therefore, the enlargement among all intervals of the first type can be bounded
by $\frac{2}{\delta^{1/2}\kappa}(s + 1)\delta\kappa = 2(s + 1)\delta^{1/2}$. The last interval generates in the worst case
an additional length (in comparison to the minimum restricted schedule) of
$\delta^{1/2}\kappa + (s + 1)\delta\kappa$. Furthermore, by the consideration of only restricted schedules
we obtain an additional length of $2\delta^{1/2}$. Using $\kappa = \delta^{k} \le 1$, we get a total
additional length (with the length added by medium operations) $\le C\delta^{1/2} + \alpha$, where
$C \le 4s + 6$ is a constant independent of $\delta$. We choose $\delta$ and $\alpha$ small enough that
$C\delta^{1/2} + \alpha \le \epsilon C^{*}_{\max}$, with $\delta$ defined such that $1/\delta^{1/2}$ is integral.
Using $\kappa = \delta^{k}$, the number $\mu = 2/(\delta^{1/2}\kappa)$ is also integral.
Controlled Conspiracy-2 Search*

(Extended Abstract) 
Abstract. When playing board games like chess, checkers, othello etc.,
computers use game tree search algorithms to evaluate a position. The
greatest success of game tree search so far has been the victory of the
chess machine 'Deep Blue' vs. G. Kasparov, the best human chess player
in the world.

When a game tree is too large to be examined exhaustively, the standard
method for computers to play games is as follows. A partial game tree
(envelope) is chosen for examination. This partial game tree may be any
subtree of the complete game tree, rooted at the starting position. It is
explored with the help of the αβ-algorithm, or any of its variants. All
αβ-variants have in common that a single faulty leaf evaluation may cause
a wrong decision at the root.

To overcome this insecurity, we propose Cc2s, a new algorithm, which
selects an envelope in such a way that the decision at the root is stable against
a single faulty evaluation. At the same time, it examines this envelope
efficiently. We describe the algorithm and analyze its time behavior and
correctness. Moreover, we present some experimental results from
the domain of chess.

Cc2s is used in the parallel chess program P.ConNerS, which won the
8th International Paderborn Computer Chess Championship 1999.

Keyword: Algorithms and Datastructures



1 Introduction 

Some games have been proven to be PSPACE-complete. As a consequence, we 
cannot do anything better than to examine a complete game tree when we want 
to find a perfect decision or if we want to know the value of the starting situation. 
For most of the interesting board games we do not know the correct values of all 
positions. Therefore, we are forced to base our decisions on heuristic or vague 
knowledge. An approximation is done by the following method. 

First of all, a partial game tree is chosen for examination. This subtree may 
be a full-width, fixed-depth tree, or any other subtree rooted at the starting 
position. We call this subtree an envelope. Thereafter, a search algorithm assigns 

* This work was supported by the DFG research project “Selektive Suchverfahren” 
under grant Mo 285/12-3. 
Fig. 1. Only the envelope is examined by a search algorithm.



heuristic evaluations to the leaves and propagates these numbers up the tree by
the minimax principle. Usually the envelope is examined with the help of the
αβ-algorithm, or the MTD(f)-algorithm. As far as the error frequency at the
root is concerned, it makes no difference whether the envelope is examined
by the αβ-algorithm or by a pure minimax algorithm. The result is always the
same, only the effort to get the result differs drastically.
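A small example makes the claim concrete. The sketch below is a textbook formulation of minimax and αβ pruning, not code from the paper; the envelope is represented as nested Python lists with numbers as leaf evaluations, which is an assumption made for illustration. The optional `counter` records how many leaves are actually evaluated.

```python
def minimax(node, maximizing=True):
    """Plain minimax value of a tree given as nested lists of numbers."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing=True, alpha=float("-inf"),
              beta=float("inf"), counter=None):
    """Minimax value with alpha-beta cutoffs; counter[0] counts leaf visits."""
    if not isinstance(node, list):
        if counter is not None:
            counter[0] += 1
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, counter))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # remaining children cannot change the result
        return best
    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, counter))
        beta = min(beta, best)
        if alpha >= beta:
            break
    return best
```

On the envelope `[[3, 5], [2, 9], [0, 1]]` both functions return the same root value, while αβ evaluates only four of the six leaves.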

The approximation of the real root value with the help of fixed-depth envelopes
leads to good results. Nevertheless, several enhancements have been found
that form the envelope more individually. Some of these techniques are domain
independent, like Singular Extensions [2], Nullmoves or Fail High Reductions.
Many others are domain dependent. The form of the envelope strongly
determines the quality of the search result.

We distinguish between two classes of game tree search algorithms. On the
one hand there are those which are built to determine the minimax value of an
envelope. The αβ-algorithm, the SCOUT-algorithm [5] and SSS* have
been exhaustively examined in the last 30 years.

A different class is that of the incremental searching algorithms, which
'grow' the search tree one step at a time. At each step a leaf of the current tree is
chosen (selection), and the successors of that leaf are added to the tree
(expansion). The new leaves are evaluated and the new heuristic values are updated
bottom up (update). In contrast to e.g. the αβ-algorithm, these algorithms need
linear space in the number of searched nodes. The advantage, however, is that
the grown trees need not be of uniform depth and the envelopes need not be
determined before the search is finished. Examples of such iterative techniques
are Berliner's B* algorithm, Palay's probability-based method, and
Conspiracy Number Search. Conspiracy Number Search was introduced by
D. McAllester. J. Schaeffer has interpreted the idea and has developed
a search algorithm that behaves well on tactical chess positions. Lorenz et al.
have presented first ideas of how to build an algorithm which is able to do
the same job more efficiently.

The starting point of Conspiracy Number Search (CNS) is the observation
that, in a certain sense, the αβ-algorithm computes decisions with low security.

Changing the value of one single leaf (e.g. because of a fault of the
heuristic evaluation function) can change the decision at the root. Thus, the
αβ-algorithm takes decisions with security (i.e. conspiracy) one.
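The effect of a single faulty leaf can be reproduced on a tiny tree. The sketch below is an illustration under assumptions: the tree is given as nested lists with numbers as leaf evaluations, the root is a maximizing node, and `root_decision` is a hypothetical helper that returns the index of the move with the best minimax value.

```python
def minimax(node, maximizing):
    """Minimax value of a tree given as nested lists of numbers."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def root_decision(tree):
    """Index of the root successor with the largest minimax value."""
    values = [minimax(child, False) for child in tree]
    return max(range(len(values)), key=values.__getitem__)


correct = [[3, 5], [2, 9]]   # successor values 3 and 2: move 0 is chosen
faulty = [[1, 5], [2, 9]]    # one leaf misjudged (3 -> 1): values 1 and 2
```

With the correct evaluations the first move is chosen; with one misjudged leaf the second move is chosen, so the decision on this envelope has conspiracy one.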
The aim of CNS is to distribute the available resources in a way that it is
guaranteed that decisions are made with a certain conspiracy $c > 1$. This means
that the decision is stable against up to $c - 1$ changes of leaf-values. Schaeffer's
algorithm manages this with the help of conspiracy vectors at each node of the
game tree.

These vectors indicate how many nodes must change their values in order
to change the minimax value of the root to $x$. As all collected pieces of
information must be available at any time, the memory requirement of the method
is determined by the number of examined nodes and by the granularity of the
evaluation function.

This enormous memory consumption is one of the reasons why the use of
CNS has been restricted to tactical positions. With the help of a coarse-grained
evaluation function one tries to find decisions which are clearly better than all
other alternatives. For tactical positions CNS has been shown to be superior to
fixed-depth full-width searches.

General inputs do not fulfill the demand that such clearly superior decisions
are available. Searching on such instances is called strategic search. There,
it is important to come to a decision even when it is only marginally better than
the other alternatives. For this purpose, one needs a fine-grained evaluation
function, and thus a large amount of memory. It is disappointing to see that
conventional CNS gets severe problems with its termination when the evaluation
function is of fine granularity. CNS sometimes examines large subtrees, only
in order to find a decision with low security. In conclusion, these deficiencies have
led to the fact that CNS could not successfully be implemented for general
problem instances.

In this paper we discuss a more carefully directed search procedure, which we
call Controlled Conspiracy 2 Search (Cc2s), and which solves these problems.

1.1 Organization of This Paper 

In this paper we present a description of the Controlled Conspiracy 2 
Search algorithm. In order to come to a decision at the root, we must determine 
a lower bound on the value of the best successor and upper bounds on the values 
of all the other successors of the root. The aim is to base the result on envelopes 
whose leaves have a distance of at least t to the root, and which contain 
at least 2 leaf-disjoint proving strategies for each of the bounds. (Remark: An 
error analysis in game trees has led us to the assumption that 'leaf-disjoint 
proving strategies' are a key term in the approximation of game tree values.) 

In section 2 we present some basic and general definitions and notations. Section 3 
describes the Cc2s-algorithm. We compare it to the αβ-algorithm and to 
conventional incremental algorithms. The following properties can be observed: 
a) If the algorithm terminates, the result will be based on envelopes with the 
desired properties, and the outcome is based on minimax values. Thus every 
minimax-based search algorithm would come to the same result if it examined 
the same envelope. b) When we examine predefined and finite envelopes (as is 
usually done in the analysis of the αβ-algorithm), our algorithm terminates in finite 
time. c) In its best case, the new algorithm will examine the minimal number 
of nodes needed to find a decision, if we slightly change it so that it is 
comparable to the αβ-algorithm. 

Section 4 deals with experimental results from the domain of chess. 

2 Definitions and Notations 

2.1 General Definitions 

Definition 1. In this paper, $G = (T, h)$ is a game tree, where $T = (V, K)$ is 
a tree ($V$ a set of nodes, $K \subseteq V \times V$ the set of edges) and $h : V \to \mathbb{Z}$ is a 
function. $L(G)$ is the set of leaves of $T$. $\Gamma(v)$ denotes the set of successors of a 
node $v$. 

Remark: We identify the nodes of a game tree G with positions of the un- 
derlying game and the edges of T with moves from one position to the next. 
Moreover, there are two players MAX and MIN. MAX moves on even and MIN 
on odd levels. We call the total game tree of a specific game the universe. 

Definition 2. Let G be a game tree. A subtree E of G is called an envelope if, 
and only if, the root of E is the root of G and, for every node v of E, E contains 
either all or none of the successors of v in G. 

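Definition 3. Let $G = (T, h)$ be a game tree and $v \in V$ a node of $T$. The 
Minimax Value, resp. the function $\mathit{minimax} : V \to \mathbb{Z}$, is inductively defined by 

$$
\mathit{minimax}(v) := \begin{cases}
h(v) & \text{if } v \in L(G) \\
\max\{\mathit{minimax}(v') \mid (v, v') \in K\} & \text{if } v \notin L(G) \text{ and MAX to move} \\
\min\{\mathit{minimax}(v') \mid (v, v') \in K\} & \text{if } v \notin L(G) \text{ and MIN to move}
\end{cases}
$$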
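
The inductive definition can be sketched in a few lines of Python. This is only an illustration: the `Node` class, its field names, and the convention that the root is a MAX node are assumptions made for the example and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical game-tree node: h is the evaluation, children the successors."""
    h: int = 0
    children: List["Node"] = field(default_factory=list)


def minimax(v: Node, max_to_move: bool = True) -> int:
    """Minimax value of v; MAX and MIN alternate from level to level."""
    if not v.children:                      # v is a leaf: use h(v)
        return v.h
    values = [minimax(c, not max_to_move) for c in v.children]
    return max(values) if max_to_move else min(values)


# Depth-2 example: MAX moves at the root, MIN at its three successors.
tree = Node(children=[
    Node(children=[Node(2), Node(5)]),
    Node(children=[Node(1), Node(7)]),
    Node(children=[Node(3), Node(4)]),
])
root_value = minimax(tree)  # max(min(2, 5), min(1, 7), min(3, 4)) = 3
```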



Definition 4. Let $G$ be a game tree with root $v \in V$, and let $s \in \{\text{MIN}, \text{MAX}\}$. 
Formally, a strategy for player $s$, $S_s = (V_s, K_s)$, is a subtree of $G$, inductively 
defined by 

- $v \in V_s$ 

- If $u \in V_s$ is an internal node of $T$ where $s$ has to move, there will be exactly 
one $u' \in \Gamma(u)$ with $u' \in V_s$ and $(u, u') \in K_s$. 

- If $u \in V_s$ is an internal node of $T$ where the opponent of $s$ has to move, 
$\Gamma(u) \subseteq V_s$, and $(u, u') \in K_s$ will hold for all $u' \in \Gamma(u)$. 

Remark: A strategy is a subtree of G that proves a certain bound of the 
minimax value of the root of G. A MIN-strategy proves an upper bound of the 
minimax value of G, and a MAX-strategy a lower one. 

Definition 5. Two strategies will be called leaf-disjoint if they have no leaf in 
common (cf. Fig. 2). 

Fig. 2. Two leaf-disjoint strategies prove the lower bound 6 at the root. This is 
equivalent to the formulation that the lower bound 6 has the conspiracy number 2. 



Definition 6. Let $G = (T, h) = ((V, K), h)$ be a game tree. A best move is a 
move from the root to a successor which has the same minimax value as the root. 
Let $m = (v, v')$ be such a move. We say $m$ is secure with conspiracy number 
$C$ and depth $d$ if there exists an $x \in \mathbb{Z}$ so that a) there are at least $C$ leaf-disjoint 
strategies, with leaves at depth at least $d$, showing the minimax value of $v'$ to be 
greater than or equal to $x$, and b) for all other successors of the root there are at least 
$C$ leaf-disjoint strategies, with leaves at depth at least $d$, showing their minimax 
values to be less than or equal to $x$. $C$ is a lower bound for the number of 
terminal nodes of $G$ that must change their values in order to change the best 
move at the root of $G$. 

Remark: Let $A$ be a game tree search algorithm. We distinguish between the 
universe, an envelope and a (current) search tree. A search tree is a subtree of 
the envelope. E.g., the minimax-algorithm and the αβ-algorithm may examine 
the same envelope, but they usually examine different search trees. Let $v$ be 
a node. In the following, $\Gamma(v)$ is the set of all successors of $v$ with respect to the 
universe. $\Gamma'(v)$ is the set of those successors of $v$ that are explicitly inspected by 
the algorithm $A$. 

Definition 7. Let $G = ((V, K), h)$ be a game tree. A value is a tuple $w = 
(a, z) \in \{'\le', '\ge', '\#'\} \times \mathbb{Z}$. $a$ is called the attribute of $w$, $z$ the number of $w$. 
$W = \{'\le', '\ge', '\#'\} \times \mathbb{Z}$ is the set of values. We denote by $w_v = (a_v, z_v)$ the 
value of the node $v$, with $v \in V$. 

Remark: Let $v$ be a node. $w_v = ('\le', x)$ will express that there is a subtree 
below $v$ the minimax value of which is $\le x$. $w_v = ('\ge', x)$ is used analogously. 
$w_v = ('\#', x)$ implies that there exists a subtree below $v$ the minimax value 
of which is $\le x$, and there is a subtree below $v$ the minimax value of which is 
$\ge x$. The two subtrees need not be identical. A value $w_1$ can be 'in contradiction' 
to a value $w_2$ (e.g. $w_1 = ('\le', 5)$, $w_2 = ('\ge', 6)$), 'supporting' (e.g. $w_1 = 
('\le', 5)$, $w_2 = ('\le', 6)$), or 'unsettled' (e.g. $w_1 = ('\ge', 5)$, $w_2 = ('\le', 6)$). 
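
The three relations between values can be made concrete with a small sketch. The paper gives them only by example, so the general rule below is an assumption: each value is read as a set of admissible numbers (an upper-bounded ray, a lower-bounded ray, or a single point for '#', all with non-strict bounds), and the relation is decided by comparing these sets. The attribute strings `"<="`, `">="`, `"#"` and the function names are hypothetical.

```python
import math
from typing import Tuple

Value = Tuple[str, int]  # (attribute, number), attribute in {"<=", ">=", "#"}


def interval(w: Value) -> Tuple[float, float]:
    """Admissible numbers of a value, as a closed interval (assumed reading)."""
    attribute, z = w
    if attribute == "<=":
        return (-math.inf, z)
    if attribute == ">=":
        return (z, math.inf)
    if attribute == "#":
        return (z, z)
    raise ValueError(f"unknown attribute: {attribute!r}")


def relation(w1: Value, w2: Value) -> str:
    """Classify w1 against w2 as contradiction, supporting or unsettled."""
    lo1, hi1 = interval(w1)
    lo2, hi2 = interval(w2)
    if hi1 < lo2 or hi2 < lo1:          # no common admissible number
        return "contradiction"
    if lo2 <= lo1 and hi1 <= hi2:       # everything w1 allows is allowed by w2
        return "supporting"
    return "unsettled"                  # the two values overlap only partly
```

Under this reading the three examples of the remark come out as stated, and a '#'-value is never unsettled, because a single point lies either inside or outside the other interval.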

Definition 8. A target is a tuple $t = (\omega, \delta, \gamma)$ with $\omega$ being a value and 
$\delta, \gamma \in \mathbb{N}_0$. 

Remark: Let $t_v = (\omega, \delta, \gamma)$ be a target which is associated with a node $v$. $\delta$ 
expresses the demanded distance from the current node to the leaves of the final 
envelope. $\gamma$ is the conspiracy number of $t_v$. It informs a node on how many 
leaf-disjoint strategies its result must be based. If the demand expressed by a target 
is fulfilled, we say that the target $t_v$ is fulfilled. 



3 Description of Cc2s 



3.1 The New Search Paradigm 



The left figure shows the data flow in our algorithm. In contrast to the 
minimax-algorithm or the αβ-algorithm, we do not look for the minimax value of 
the root. We try to separate a best move from the others by proving that there 
exists a number $x$ such that the minimax value of the successor with the highest 
payoff is at least $x$, and the payoffs of the other successors are less than or equal to $x$. 

At any point in time the searched tree offers such an $x$ and a best move $m$. 
As long as $m$ is not secure enough, we take $x$ and $m$ as a hypothesis only, and 
we commission the successors of the root either to show that new estimations 
make the hypothesis fail, or to verify it. The terms 'failing' and 'verifying' 
are used in a weak sense: they are related to the best possible knowledge at 
a specific point in time, not to absolute truth. New findings can cancel former 
'verifications'. The verification is handled with the help of the targets, which are 
split and spread over the search tree in a top-down fashion. A target $t$ expresses a 
demand to a node. Each successor of a node which is supplied with a target takes 
its own value as an expected outcome of a search below itself, and commissions 
its successors to examine some sub-hypotheses, etc. A target $t$ at a node $v$ will be 
fulfilled when $t$ demands a leaf, or when the targets of all of $v$'s successors are fulfilled. 
When a target is fulfilled at a node $v$, the result 'OK' is given to the father of 
$v$. If the value of $v$ changes in a way that it contradicts the value component of 
$t$, the result will be 'NOT-OK'. 

In the following example, the task is to find the best move, concerning 
a depth-2 fixed-depth envelope. The figure just below shows the incremental 
growth of strategies. 



[Figure: data flow in Cc2s. Task at the root: prove that m is secure with depth δ+1 and conspiracy X.] 

The first step is to evaluate all successors of the root. With respect to a 
depth-1 search, $v_3$ becomes the 'best' successor (i.e. the one with the highest 
payoff) of the root. Thus, we conjecture that $v_3$ will also be the best move with 
respect to a depth-2 search. We build the targets $t_1 = (('\le', 3), 1, 1)$ for $v_1$, the 
same for $v_2$, and $t_3 = (('\ge', 3), 1, 1)$ for $v_3$. 
$v_1$ starts the verification process as it is the leftmost successor of the root. 
It generates its first successor and inquires whether or not its value is $\le 3$. $v_2$ 
does the same, and $v_3$ generates all successors and finds all of them to be $\ge 3$. 
Since all partial expansions fit the targets, $v_1$, $v_2$ and $v_3$ return OK to the 
root. Thus, we know at the root that the move to $v_3$ is secure with depth 2. 

The targets consume much less memory than the conspiracy vectors of the 
conventional CNS and, moreover, they allow an efficient cutting mechanism, 
similar to that of the αβ-algorithm. We can make use of the fact that it is often expensive 
and superfluous to compute exact minimax values if we only need an upper resp. 
a lower bound on them. These properties make it possible to use Cc2s even for 
strategic searches, which need fine-grained evaluation functions. The top-down 
splitting of the targets offers high flexibility, which makes it possible to realize various 
security concepts. In the following, we present a hybrid mixture of fixed-depth 
searches and conspiracy number searching. It is even possible to use tree-forming 
heuristics such as Fail High Reductions. 



[Figure: incremental growth of strategies in the depth-2 example. Panels: 1) Evaluate all successors of the root. 2) Assign targets: $v_1$ and $v_2$ are asked to prove that their value is $\le 3$ by a depth 1 search, $v_3$ to prove that its value is $\ge 3$ by a depth 1 search; successors are asked "Is your value $\le 3$?". 4) Verified.] 


3.2 Algorithmic Details 

Values Are Updated Bottom Up. A crucial point is how to react when a 
partial expansion step leads to values which contradict a target. The task is to 
gather the new pieces of information, to draw a maximum of benefit from them, 
and at the same time to guarantee correctness and termination of the whole 
algorithm. The operator of Figure 3 has been designed for that purpose. We 
give some examples of contradictions and their solution: Let $v$ be a maxnode 
with a value $('\le', 5)$. Let the value be determined by the fact that all successors 
of $v$ have values $('\le', 5)$. Now let us evaluate $v$ with the help of our heuristic 
evaluation function, and let the result be that the direct value of $v$ is something 
$> 5$, e.g. $('\ge', 7)$. Now, UpdateValue assigns the value $('\#', 5)$ to $v$ and solves 
the contradiction (ll. 9, 10). Another example: Let $v$ be a maxnode with value 
$('\ge', 5)$ and two successors. One successor has got the value $('\le', 3)$. If the 
second successor has a value $('\le', 10)$, or $('\ge', 3)$, or has not been evaluated 
at all, the value of $v$ will remain $('\ge', 5)$ (l. 6). If, however, the second successor 
gets the value $('\le', 3)$, the new value will be built by lines 9 and 10, and $v$ will 
get the value $('\#', 3)$. 




value UpdateValue(node v) 

1 /* Let n be a MAX node */ 

2 if r'(v) ^ r{v) or 3s G r'{v) with Us = or 

3 3s € r'{v) with (as = and Zs > Zv) then { 

4 if = ’<’ and 3s € r'(v) with (a^ G {’>’ , } and Zs > Zv) 

5 then a„ := ; 

6 Zv ~ meix{zv , max{zs \ s G -T'(n), as G {’#’ }}; 

7 } else { 

s if o„ = ’>’ or (a„ = ’<’ and 3s G r'{v) with {as = and Zs > Zv)) 

9 then av := ; 

10 := max{zs \ s G -T'(i;)}; 

} /* If n is a MIN node the result is analogously defined. */ 

/* You only exchange < and >, and max by min. */ 

Fig. 3. Update the heuristic value of a node. 

UpdateValue has some properties which make the combined values 
more than merely heuristic ones. 

Theorem 1. When inner values of nodes are gathered bottom up with the help 
of UpdateValue, we can prove that (a) $w_v = ('\le', x)$ implies that there exists a 
subtree below $v$ the minimax value of which is $\le x$, (b) $w_v = ('\ge', x)$ implies 
that there exists a subtree below $v$ the minimax value of which is $\ge x$, and (c) 
$w_v = ('\#', x)$ implies that there exists a subtree below $v$ whose minimax value is 
$\le x$ and there exists a subtree below $v$ whose minimax value is $\ge x$. These 
two subtrees are not necessarily identical. 

Moreover, it is obvious that there are no longer value contradictions at the nodes. 
For all proofs of the theorems we refer to the full version of this paper. 

Forming the Envelope and Handling Security In order to prove a lower 
bound at a maxnode, or an upper bound at a minnode, it is sufficient that one 
successor holds the bound. Such a node is called a ’cutnode’. If we want to 
prove an upper bound at a maxnode, all successors will have to hold the bound. 
Therefore, we call that type of node an ’allnode’. 

We define targets with the help of the following observations: Let $v$ be a 
maxnode and let $x$ be an upper bound on the minimax value of $v$. Then $x$ is an 
upper bound on the minimax values of all sons of $v$. If there are $C$ leaf-disjoint 
strategies below each son of $v$ which prove the bound $x$, there will be $C$ 
leaf-disjoint strategies which prove the bound $x$ below $v$, too. More generally, if $C_i$ 
is the number of leaf-disjoint strategies below the node $v_i$ (for all sons $v_i$ of $v$), 
the number of leaf-disjoint strategies that prove the bound $x$ below $v$ will be 
$\min_i C_i$. 

Now, let $v$ be a maxnode and $x$ a lower bound on the minimax value of $v$. 
Let $c$ be the number of sons of $v$ which have a minimax value $\ge x$. (Because of 
the minimax rule there is at least one such successor.) Let $C_i$, $i \in \{1 \ldots c\}$, be 
the number of leaf-disjoint strategies below $v_i$ that prove the bound $x$ at the 
nodes $v_i$. Then the number of leaf-disjoint strategies that prove the bound $x$ at 
the node $v$ is $\sum_{i=1}^{c} C_i$. Minnodes are handled analogously. 

Last but not least, if there is a strategy below $v$ the leaves of which have a 
distance of $d$ to $v$, there will be strategies below all sons of $v$ the leaves of which 
have a distance of $d - 1$ to the sons of $v$. 

When our algorithm enters a node $v$ with target $t$, it decides whether $v$ is a 
cutnode or an allnode with respect to $t$. Then it builds sub-targets for all successors 
of $v$ in such a way that $t$ will be fulfilled when all sub-targets are fulfilled. 
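
The two counting rules (minimum over the sons at an allnode, sum over the sons at a cutnode) can be illustrated on a fully evaluated tree. The sketch below counts leaf-disjoint MAX-strategies that prove a lower bound; it ignores the depth demand and assumes that leaf values are exact. The `Node` class and the function name are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical game-tree node with leaf value h and a list of successors."""
    h: int = 0
    children: List["Node"] = field(default_factory=list)


def lower_bound_strategies(v: Node, x: int, max_to_move: bool = True) -> int:
    """Number of leaf-disjoint MAX-strategies below v proving minimax(v) >= x."""
    if not v.children:
        return 1 if v.h >= x else 0
    counts = [lower_bound_strategies(c, x, not max_to_move) for c in v.children]
    # For a lower bound a MAX node is a cutnode (one son suffices, so the
    # strategies of different sons add up); a MIN node is an allnode (every
    # son must hold the bound, so the weakest son limits the count).
    return sum(counts) if max_to_move else min(counts)


# MAX root with two MIN successors; both successors have minimax value 6.
tree = Node(children=[
    Node(children=[Node(6), Node(7)]),
    Node(children=[Node(8), Node(6)]),
])
```

For this tree the lower bound 6 is proved by two leaf-disjoint strategies, one through each successor of the root, while the bound 7 is proved by none.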

The Search Algorithm Dealing with Security In the following, we assume 
that there is a heuristic evaluation procedure which either can return a heuristic 
point-value x, or which can answer whether x is smaller or greater than a given 
number y. 

We call the starting routine (no figure) at the root DetermineMove(root $r$, 
$d$, $c = 2$). $d$ stands for the remaining depth and $c$ for the conspiracy number 
which the user wants to achieve. If the successors of $r$ have not been generated 
yet, DetermineMove will do this, and it will assign heuristic values of the form 
$('\#', \ldots)$ to all successors. It picks the successor which has got the highest 
value $x$, and assigns a lower bound target of the form $(('\ge', x), d, c)$ to the best 
successor and targets of the form $(('\le', x), d, c)$ to all other successors. Then it 
starts the procedure Cc2s on all successors. DetermineMove repeats the previous 
steps until Cc2s returns with OK at all of $r$'s successors. 
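
The target assignment at the root can be sketched as follows. This covers only the step that picks the best successor and hands out targets; generating successors, calling Cc2s and repeating until all successors return OK are left out. The function name, the dictionary representation and the attribute strings `">="` and `"<="` are assumptions made for the illustration.

```python
from typing import Dict, Tuple

Value = Tuple[str, int]           # (attribute, number)
Target = Tuple[Value, int, int]   # (value, remaining depth, conspiracy number)


def root_targets(heuristic: Dict[str, int], d: int, c: int = 2) -> Dict[str, Target]:
    """Assign a lower-bound target to the best successor, upper bounds to the rest."""
    best = max(heuristic, key=lambda name: heuristic[name])
    x = heuristic[best]
    targets: Dict[str, Target] = {}
    for name in heuristic:
        attribute = ">=" if name == best else "<="
        targets[name] = ((attribute, x), d, c)
    return targets


# Three successors; the heuristic values of v1 and v2 are made up for the example.
targets = root_targets({"v1": 1, "v2": 2, "v3": 3}, d=1, c=1)
```

With these inputs `v3` receives the target `((">=", 3), 1, 1)` and the other two successors receive `(("<=", 3), 1, 1)`, matching the depth-2 example of section 3.1.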

Let $\Gamma'(v)$ be the set of successors of $v$, as far as the current search tree is 
concerned. Let $t_v$ be the target for $v$, $w_v$ the value of $v$. Let $v_1 \ldots v_{|\Gamma'(v)|}$ be the 
successors of $v$ with respect to the current search tree. Let $t_1 \ldots t_{|\Gamma'(v)|}$ be the targets 
of the nodes $v_1 \ldots v_{|\Gamma'(v)|}$, and let $w_1 \ldots w_{|\Gamma'(v)|}$ be their values. We say that a 
node is OnTarget($v$, $t_v$) when the value of $v$ is not in contradiction to the value 
component of $t_v$. This expresses that Cc2s is still on the right track. When Cc2s 
(figure 4) enters a node $v$, it is guaranteed that $v$ is on target and that the value of 
$v$ supports $t_v$ (either by DetermineMove, or because of figure 4, ll. 5-6). Firstly, 
Cc2s checks whether $v$ is a leaf, i.e. whether $t_v$ is trivially fulfilled (l. 1). This 
is the case when the remaining search depth of the target is zero ($\delta_v = 0$) and 
the demanded conspiracy number, i.e. the number of leaf-disjoint bound-proving 
strategies, is 1 ($\gamma_v \le 1$). If $v$ is not a leaf, the sub-algorithm PartialExpansion 
(no figure) will try to find successors of $v$ which are well suited for a splitting 
operation. Therefore, it starts the evaluation of successors which either have not 
yet been evaluated, or which have an unsettled value in relation to the target 
$t_v = (\cdot, x)$. If a successor $s$ has been evaluated once before and is examined 
again, it will get a point value of the form $('\#', y)$. For an argumentation of 
progress and termination, it is important that successors which have already 
been evaluated once before, and now are unsettled, are supplied with a point 
value. Since '#'-values cannot be unsettled, each node can be evaluated at most 
twice! For a not yet examined successor $s$ the evaluation function is inquired 
whether the value of $s$ supports or contradicts the value component of $t_v$. $s$ gets 
the value $('\ge', x)$ or $('\le', x - 1)$. PartialExpansion works from left to right, and 
updates the value of $v$ after each evaluation. 

bool Cc2s(node v, target t_v = (α_v, β_v, δ_v, γ_v)) 

 1 if (δ_v = 0 and γ_v ≤ 1) or |Γ(v)| = 0 return OK; 
 2 r := NOT_OK; 
 3 while r = NOT_OK do { 
 4   PartialExpansion(v, t_v); 
 5   if not OnTarget(v, t_v) return NOT_OK; 
 6   Split(v, t, v_1 ... v_|Γ'(v)|); /* assigns targets to the sons */ 
 7   for i := 1 to |Γ'(v)| do { 
 8     r := Cc2s(v_i, t_i); 
 9     w_v := UpdateValue(v); 
10     if not OnTarget(v, t_v) return NOT_OK; 
11     if r = NOT_OK break; /* Leave the for-loop, goto l. 3 */ 
12   } 
13 } /* while ... */; 
14 return OK; 

Fig. 4. Recursive Search Procedure 



If $v$ is an allnode and a partial expansion changes the value of $v$ in a way that 
it contradicts the target $t$, Cc2s will immediately stop and leave $v$ by line 11. 
If $v$ is a cutnode, PartialExpansion will evaluate successors which have not yet 
been examined or which are unsettled in relation to $t$, until it has found 
$\gamma_v$ successors which support the value of $v$. 

After that, the target of the node $v$ is 'split', i.e. sub-targets are worked 
out for the successors of $v$ according to the observations made under 'Forming the 
Envelope and Handling Security'. The resulting sub-targets 
are given to the successors of $v$, and Cc2s examines the sons of $v$ until either 
all sons of $v$ have fulfilled their targets (some successors may get so-called 
null-targets, i.e. targets that are always fulfilled), or $v$ itself is not 'on target' 
any longer, which means that the value of $v$ contradicts the current target of $v$. 
When a call of Cc2s returns with the result OK at a node $v_i$ (line 8), the node 
$v_i$ could fulfill its sub-target. When Cc2s returns with NOT-OK, some values 
below $v_i$ have changed in a way that it seems impossible that the target of $v_i$ 
can be fulfilled any more. In this case, Cc2s must decide whether to report a 
NOT-OK to its father (line 10), or to arrange new sub-targets for its sons (ll. 
11 and 3). 

Theorem 2. If the algorithm terminates, the result at the root will be based on 
an envelope with the desired properties. A minimax-algorithm would come to the 
same result with respect to that envelope. 



3.3 A Restriction of Cc2s, Compared to the αβ-Algorithm 

In this section, we would like to compare our algorithm with the most successful 
αβ-algorithm. As the αβ-algorithm is not able to deal with conspiracy numbers, 
we restrict our algorithm to fixed-depth searches. The only further change is that 
we drop the demand for 2 leaf-disjoint proving strategies. Let us call the resulting 
algorithm Cc1s. The analysis of the αβ-algorithm is usually restricted to fixed-depth 
full-width game trees with a uniform branching factor. We follow this restriction. 

Let $G$ be a fixed-depth full-width game tree with depth $d$ and breadth $b$. In 
order to find the best move, the αβ-algorithm must evaluate the minimax value 
of the root of $G$. Therefore it examines at least $b^{\lfloor d/2 \rfloor} + b^{\lceil d/2 \rceil} - 1$ leaves. 

Theorem 3. (Effort) In its best case, the Cc1s-algorithm evaluates $\ldots + (b - 1) \cdot \ldots$ 
leaves of $G$ in order to find the best move at the root. With 
regard to the number of evaluated leaves in the search tree this is optimal. 
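
The following sketch puts numbers to the comparison on uniform trees. It is an illustration under stated assumptions, not the statement of Theorem 3: `alphabeta_leaf_count` runs a plain αβ-search on a tree whose leaves all evaluate to 0, which is a convenient stand-in for perfect move ordering, and `separation_leaves` counts the leaves of one MAX-strategy below the best successor plus one MIN-strategy below each other successor, i.e. what a proof with conspiracy 1 and depth $d$ in the sense of Definition 6 needs at least. All function names are hypothetical.

```python
import math


def alphabeta_leaf_count(b: int, d: int) -> int:
    """Leaves evaluated by alpha-beta on a uniform tree with all leaf values 0."""
    count = 0

    def search(depth: int, alpha: float, beta: float, max_to_move: bool) -> float:
        nonlocal count
        if depth == 0:
            count += 1
            return 0
        value = -math.inf if max_to_move else math.inf
        for _ in range(b):
            child = search(depth - 1, alpha, beta, not max_to_move)
            if max_to_move:
                value = max(value, child)
                alpha = max(alpha, value)
                if value >= beta:       # beta cutoff
                    break
            else:
                value = min(value, child)
                beta = min(beta, value)
                if value <= alpha:      # alpha cutoff
                    break
        return value

    search(d, -math.inf, math.inf, True)
    return count


def alphabeta_min_leaves(b: int, d: int) -> int:
    """Closed form b^floor(d/2) + b^ceil(d/2) - 1 for the minimal alpha-beta tree."""
    return b ** (d // 2) + b ** ((d + 1) // 2) - 1


def strategy_leaves(b: int, depth: int, owner_moves_first: bool) -> int:
    """Leaves of one strategy: one successor on own moves, all b on the opponent's."""
    leaves, owner_to_move = 1, owner_moves_first
    for _ in range(depth):
        if not owner_to_move:
            leaves *= b
        owner_to_move = not owner_to_move
    return leaves


def separation_leaves(b: int, d: int) -> int:
    """Leaves needed to separate the best move with conspiracy 1 (MAX at the root)."""
    # The successors of the root are MIN nodes with remaining depth d - 1.
    lower = strategy_leaves(b, d - 1, owner_moves_first=False)  # MAX-strategy
    upper = strategy_leaves(b, d - 1, owner_moves_first=True)   # MIN-strategy
    return lower + (b - 1) * upper
```

For $b = 3$ and $d = 4$ the αβ count is 17 and the separation count is 15; for $d = 2$ both give $2b - 1$.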

Theorem 4. (Correctness) The decision-move which Cc1s finds is based on the 
minimax value of $G$, i.e. every minimax-based algorithm will come to the same 
result. 

Theorem 5. (Termination) If $G$ is finite (especially if $G$ is a fixed-depth 
full-width game tree), Cc1s (as well as Cc2s) finishes its work in finite time. 



3.4 Cc2s Compared to Select-Expand-Update Based Algorithms 

Our new technique has two advantages over algorithms which expand a leaf node 
without being informed what the expansion is good for. First, in practice it is often 
faster to decide whether the value is below or above a certain bound than 
to compute a point value. We profit from this. Second, we need not generate all 
sons of a node. We win a constant but decisive factor, which we estimate at 
about 50 for the game of chess. This is so much that it seems nearly impossible 
that any algorithm which is based on the Select-Expand-Update paradigm 
is able to play high-level chess. 



4 Experimental Results 

Firstly, we added some enhancements to the Cc2s algorithm: a) An evaluation 
is a depth-2 αβ-search plus quiescence search. b) Analogously to the Fail-High-Reduction 
technique, the remaining depth of a target will be further decreased 
if an evaluation, refined by a small threat detection, indicates a cutnode. Moreover, 
there are some heuristics of minor importance that speed up the convergence 
of Cc2s. None of these heuristics is allowed to contain chess-specific knowledge. 

We divide the tests into two parts. The first part is based on the so-called 
BT2630 test-set. It consists of 30 positions. Each one is supplied with 
a grandmaster-tuned solution move. The machine gets exactly 15 minutes per 
position. Positions which have not been solved at the end of the given 
time count as 900 seconds. Those which are solved correctly count with the 
number of seconds needed for computing the solution. 

[Results on the BT2630 test] 

The test-set associates a pseudo-ELO number with the result. (The ELO system 
is a statistical measure of the relative playing strength of chess players.) 
With regard to this test-set, ConNerS stands comparison with two of the world's 
top-level chess programs, Fritz and Hiarcs. 

Although the test-set stresses tactical performance, it is important to note 
that the program finds tactical lines when it uses a fine-grained evaluation func- 
tion. 

The second kind of testing is based on games. Here, the main challenge was 
to relate our Cc2s algorithm to a chess program that uses an αβ 
game tree algorithm supplied with all enhancements, such as transposition 
tables, sorting heuristics, FHR or the Nullmove technique. 

We have selected 25 starting positions and have played two series of 50 games 
against Cheiron’97. That program uses the negascout algorithm enhanced with 
Fail High Reductions, transposition tables, killer heuristics, recapture and chess 
extensions. The program and its predecessors have proven to be successful in 
several tournaments. The main advantage, however, is that an evaluation function 
and a depth-2 αβ-search for evaluations of ConNerS are available. Thus, we 
can exactly compare the remaining search of Cheiron’97 with our new algorithm. 

A first series of games was played on Sparc 144 MHz machines, each side getting 
8 hours for 40 moves. It ended 26.5 to 23.5 for Cheiron’97. In a second series both 
programs got 4 hours for 40 moves. This series ended 26.5 to 23.5, too (although 
the individual results differ). Taking both series together, we have a final result 
of 53 to 47. We interpret this result as Cheiron’97 and ConNerS being 
equally strong, although we are aware of the fact that these 100 games are not 
sufficient for a proof in a statistical sense. 
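
A rough calculation shows how small the observed margin is. The sketch below uses the standard logistic rating model, in which an expected score $E$ corresponds to a rating difference of $400 \log_{10}(E / (1 - E))$, and the fact that a per-game result in $[0, 1]$ has a standard deviation of at most 0.5. Both choices are assumptions of this illustration; the paper reports only the match scores.

```python
import math


def elo_difference(score_fraction: float) -> float:
    """Rating difference implied by a score fraction under the logistic model."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


def max_standard_error(games: int) -> float:
    """Upper bound on the standard error of the mean score over `games` games."""
    return 0.5 / math.sqrt(games)


fraction = 53 / 100                  # combined score of the two series
margin = fraction - 0.5              # distance from an even score
difference = elo_difference(fraction)
error_bound = max_standard_error(100)
```

The 53 to 47 result corresponds to a difference of roughly 21 rating points, and the margin of 0.03 is below the error bound of 0.05, which is in line with the authors' caution about statistical significance.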

The greatest success of P.ConNerS (the parallel version of ConNerS) was winning 
the 8th International Paderborn Computer Chess Championship, where 
we competed with world class programs like Nimzo or Shredder (World Champion 
1999). At the World Computer Chess Championship P.ConNerS ended up 
with 3 to 4 points, but post-tournament analysis revealed that minor opening 
book variations were the main reasons for the losses. 

Abstract. We prove that several global properties (global convergence, 
global asymptotic stability, mortality, and nilpotence) of particular classes 
of discrete time dynamical systems are undecidable. Such results had 
been known only for point-to-point properties. We prove these properties 
undecidable for saturated linear dynamical systems, and for continuous 
piecewise affine dynamical systems in dimension three. We also 
describe some consequences of our results on the possible dynamics of 
such systems. 



1 Introduction 

This paper studies problems such as the following: given a discrete time 
dynamical system of the form $x_{t+1} = f(x_t)$, where $f : \mathbb{R}^n \to \mathbb{R}^n$ is a saturated 
linear function or, more generally, a continuous piecewise affine function, decide 
whether all trajectories converge to the origin. 

We show in our main theorem that this global convergence problem is 
undecidable. The same is true for three related problems: Stability (is the 
dynamical system globally asymptotically stable?), Mortality (do all trajectories go 
through the origin?), and Nilpotence (does there exist an iterate $f^k$ of $f$ such 
that $f^k \equiv 0$?). 

It is well known that various types of dynamical systems, such as hybrid 
systems, piecewise affine systems, or saturated linear systems, can simulate Turing 
machines. In these simulations, a machine configuration 
is encoded by a point in the state space of the dynamical system. It then 
follows that point-to-point properties of such dynamical systems are undecidable. 
For example, given a point in the state space, one cannot decide whether the 
trajectory starting from this point eventually reaches the origin. The results 
described in this contribution are of a different nature since they deal with global 
properties of dynamical systems. 



H. Reichel and S. Tison (Eds.): STAGS 2000, LNCS 1770, pp. 479-|52SI 2000. 
(c) Springer- Verlag Berlin Heidelberg 2000 

Related undecidability results for such global properties have been obtained 
in our earlier work, but for the case of discontinuous piecewise affine systems. 
The additional requirement of continuity imposed in this paper is a severe 
restriction, and makes undecidability much harder to establish. Surveys of 
decidability and complexity results for dynamical systems are also available. 

Our main result (Theorem 1) is a proof of Sontag’s conjecture that 
global asymptotic stability of saturated linear systems is not decidable. Saturated 
linear systems are systems of the form $x_{t+1} = \sigma(A x_t)$ where $x_t$ evolves in the 
state space $\mathbb{R}^n$, $A$ is a square matrix, and $\sigma$ denotes componentwise application 
of the saturated linear function $\sigma : \mathbb{R} \to [-1, 1]$ defined as follows: $\sigma(x) = x$ 
for $|x| \le 1$, $\sigma(x) = 1$ for $x \ge 1$, $\sigma(x) = -1$ for $x \le -1$. These dynamical 
systems occur naturally as models of neural networks or as models of 
simple hybrid systems. 
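
As a concrete illustration, a saturated linear system can be iterated in a few lines. The helper names `sat`, `step` and `trajectory`, the list-of-rows matrix representation and the example matrix are assumptions of this sketch; floating-point numbers stand in for the real-valued state.

```python
from typing import List, Sequence


def sat(x: float) -> float:
    """Saturated linear function: identity on [-1, 1], clipped outside."""
    return max(-1.0, min(1.0, x))


def step(A: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """One step x -> sigma(A x), with sigma applied componentwise."""
    return [sat(sum(a * xi for a, xi in zip(row, x))) for row in A]


def trajectory(A: Sequence[Sequence[float]], x0: Sequence[float], steps: int) -> List[List[float]]:
    """The first steps + 1 points x_0, x_1, ... of the trajectory from x0."""
    points = [list(x0)]
    for _ in range(steps):
        points.append(step(A, points[-1]))
    return points


# The first coordinate is halved each step, the second is doubled until it saturates.
A = [[0.5, 0.0], [0.0, 2.0]]
points = trajectory(A, [1.0, 0.25], 3)
```

Here the trajectory is (1, 0.25), (0.5, 0.5), (0.25, 1), (0.125, 1): the second coordinate saturates at 1 and stays there, so this trajectory does not converge to the origin.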

Theorem 1 is proved in three main steps. First, we prove that any 
Turing machine can be simulated by a saturated linear dynamical system with 
a strong notion of simulation. (Turing machines are defined in a later section.) Then, 
using a result of Hooper, we prove that there is no algorithm that 
can decide whether a given continuous piecewise affine system has a trajectory 
contained in a given hyperplane. Finally, we prove Theorem 1. 

In light of our undecidability result, any decision algorithm for the stability 
of saturated linear systems will be able to handle only special classes of systems. 
In the full version of this paper we consider two such classes: systems of the 
form $x_{t+1} = \sigma(A x_t)$ where $A$ is a nilpotent matrix, or a symmetric matrix. We 
show that stability remains undecidable for the first class, but is decidable for 
the second. 

Saturated linear systems fall within the class of continuous piecewise affine 
systems and so our undecidability results equally apply to the latter class of 
systems. More precise statements for continuous piecewise affine systems and 
some suggestions for further work are given in later sections. 

For some of our results we give complete proofs. For others we provide only 
a sketch, or we refer to the full version of the paper. 

2 Dynamical Systems 

In the sequel, $X$ denotes a metric space and 0 some arbitrary point of $X$, to be 
referred to as the origin. When $X \subseteq \mathbb{R}^n$, we assume that 0 is the usual origin 
of $\mathbb{R}^n$. A neighborhood of 0 is an open set that contains 0. Let $f : X \to X$ be a 
function such that $f(0) = 0$. We say that $f$ is: 

(a) globally convergent if for every initial point $x_0 \in X$, the trajectory $x_{t+1} = 
f(x_t)$ converges to 0. 

(b) locally asymptotically stable if for any neighborhood $U$ of 0, there is another 
neighborhood $V$ of 0 such that for every initial point $x_0 \in V$, the trajectory 
$x_{t+1} = f(x_t)$ converges to 0 without leaving $U$ (i.e., $x_t \in U$ for all $t \ge 0$ 
and $\lim_{t \to \infty} x_t = 0$). 

(c) globally asymptotically stable if $f$ is globally convergent and locally 
asymptotically stable. 

(d) mortal if for every initial point $x_0 \in X$, there exists $t \ge 0$ with $x_t = 0$. The 
function $f$ is called immortal if it is not mortal. 

(e) nilpotent if there exists $k \ge 1$ such that the $k$-th iterate of $f$ is identically 
equal to 0 (i.e., $f^k(x) = 0$ for all $x \in X$). 
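
The difference between these properties can be seen on two small saturated linear examples. The sketch below only follows individual trajectories for a bounded number of steps, so it can illustrate the definitions but cannot decide any of the properties. The helper names and the example matrices are assumptions of this illustration.

```python
from typing import List, Optional, Sequence


def sat(x: float) -> float:
    """Saturated linear function: identity on [-1, 1], clipped outside."""
    return max(-1.0, min(1.0, x))


def f(A: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """The saturated linear map x -> sigma(A x)."""
    return [sat(sum(a * xi for a, xi in zip(row, x))) for row in A]


def first_hit_of_origin(A: Sequence[Sequence[float]], x0: Sequence[float],
                        max_steps: int) -> Optional[int]:
    """Smallest t <= max_steps with x_t = 0, or None if the origin is not reached."""
    x = list(x0)
    for t in range(max_steps + 1):
        if all(c == 0 for c in x):
            return t
        x = f(A, x)
    return None


# f(x) = (sigma(2 * x_2), 0): the second iterate is identically 0, so f is nilpotent.
NILPOTENT = [[0.0, 2.0], [0.0, 0.0]]
# f(x) = sigma(x / 2): every trajectory converges to 0, but only x_0 = 0 reaches it.
HALVING = [[0.5]]
```

With `NILPOTENT`, every starting point reaches the origin after at most two steps. With `HALVING`, the trajectory from 1 is 1, 1/2, 1/4, ... and never reaches 0 exactly in exact arithmetic, so the map is globally convergent without being mortal.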



Nilpotence obviously implies mortality, which implies global convergence; and 
global asymptotic stability also implies global convergence. In general, this is all 
that can be said of the relations between these properties. Note, however, the 
following simple lemma, which will be used repeatedly. 

Lemma 1. Let $X$ be a metric space with origin 0, and let $f : X \to X$ be a 
continuous function such that $f(0) = 0$. If $f$ is nilpotent, then it is globally 
asymptotically stable. Moreover, if $X$ is compact and if there exists a neighborhood 
$O$ of 0 and an integer $j \ge 1$ such that $f^j(O) = \{0\}$, the four properties 
of nilpotence, mortality, global asymptotic stability, and global convergence are 
equivalent. 

Proof. Assume that $f$ is nilpotent and let $k$ be such that $f^k \equiv 0$. Let $U$ and $V$ be 
two neighborhoods of 0. A trajectory starting in $V$ never leaves $\bigcup_{i=0}^{k-1} f^i(V)$. By 
continuity, for any $U$ one can choose $V$ so that $f^i(V) \subseteq U$ for all $i = 0, \ldots, k - 1$. 
A trajectory originating in such a $V$ never leaves $U$. This shows that $f$ is globally 
asymptotically stable. 

Next, assume that $X$ is compact and that $f^j(O) = \{0\}$ for some neighborhood 
$O$ of 0 and some integer $j \ge 1$. It suffices to show that if $f$ is globally 
convergent, then it is nilpotent. If $f$ is globally convergent, then $X = \bigcup_{i \ge 0} f^{-i}(O)$. 
Since $f$ is continuous and $O$ is open, the sets $f^{-i}(O)$ are open, so by compactness 
there exists $p \ge 0$ such that $X = \bigcup_{i=0}^{p} f^{-i}(O)$. We conclude 
that $f^{p+j}(X) = \{0\}$. □ 

A function $f : \mathbb{R}^n \to \mathbb{R}^n$ is piecewise affine if $\mathbb{R}^n$ can be represented as
the union of a finite number of subsets $X_i$ where each set $X_i$ is defined by the
intersection of finitely many open or closed halfspaces of $\mathbb{R}^n$, and the restriction
of $f$ to each $X_i$ is affine. Let $\sigma : \mathbb{R} \to \mathbb{R}$ be the continuous piecewise affine
function defined by: $\sigma(x) = x$ for $|x| \leq 1$, $\sigma(x) = 1$ for $x \geq 1$, $\sigma(x) = -1$
for $x \leq -1$. Extend $\sigma$ to a function $\sigma : \mathbb{R}^n \to \mathbb{R}^n$, by letting $\sigma(x_1, \ldots, x_n) =
(\sigma(x_1), \ldots, \sigma(x_n))$. A saturated affine function ($\sigma$-function for short) $f : \mathbb{R}^n \to
\mathbb{R}^n$ is a function of the form $f(x) = \sigma(Ax + b)$ for some matrix $A \in \mathbb{Q}^{n \times n}$ and
vector $b \in \mathbb{Q}^n$. Note that we are restricting the entries of $A$ and $b$ to be rational
numbers so that we can work within the Turing model of digital computation.
A saturated linear function ($\sigma_0$-function for short) is defined similarly except
that $b = 0$. Note that the function $\sigma : \mathbb{R}^n \to \mathbb{R}^n$ is piecewise affine, with the
polyhedra $X_i$ corresponding to the different faces of the unit cube $[-1, 1]^n$, and
so is the linear function $f(x) = Ax$. It is easily seen that the composition of
piecewise affine functions is also piecewise affine and therefore $\sigma$-functions are
piecewise affine.
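
To make the definition concrete, here is a minimal Python sketch of a saturated affine function over exact rationals. The helper names (`sigma`, `saturated_affine`, `trajectory`) and the example matrix are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction

def sigma(v):
    """Componentwise saturation: clip every entry to the interval [-1, 1]."""
    return [max(Fraction(-1), min(Fraction(1), x)) for x in v]

def saturated_affine(A, b):
    """Return the map f(x) = sigma(A x + b) for a rational matrix A and vector b."""
    def f(x):
        return sigma([sum(a * xi for a, xi in zip(row, x)) + bi
                      for row, bi in zip(A, b)])
    return f

def trajectory(f, x0, steps):
    """Return [x_0, x_1, ..., x_steps] with x_{t+1} = f(x_t)."""
    xs = [list(x0)]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return xs

# Hypothetical example: a saturated linear function (b = 0) in dimension 2.
A = [[Fraction(0), Fraction(2)], [Fraction(0), Fraction(0)]]
f = saturated_affine(A, [Fraction(0), Fraction(0)])
orbit = trajectory(f, [Fraction(3), Fraction(5)], 2)
# The first step saturates (10, 0) to (1, 0); the second step reaches the origin.
```

Because this particular $A$ satisfies $A^2 = 0$, the example map sends every point to the origin within two steps, so it is nilpotent in the sense of definition (e).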
Our main result is the following theorem.

Theorem 1. The problems of determining whether a given saturated linear
function is (i) globally convergent, (ii) globally asymptotically stable, (iii) mortal, or
(iv) nilpotent, are all undecidable.

Notice that deciding the global asymptotic stability of a saturated linear
system is a priori no harder than deciding its global convergence, because the
local asymptotic stability of saturated linear systems is decidable. (Indeed, a
system $x_{t+1} = \sigma(A x_t)$ is locally asymptotically stable if and only if the system
$x_{t+1} = A x_t$ is, since these systems are identical in a neighborhood of the origin.
Furthermore, a linear system is locally asymptotically stable if and only if all of
its eigenvalues have magnitude less than one.) In fact, we conjecture that for
saturated linear systems, global convergence is equivalent to global asymptotic
stability. This equivalence is proved for symmetric matrices in the full version of
the paper. If this conjecture is true, it is not hard to see that the equivalence of
mortality and nilpotence also holds.

Theorem 1 has some “purely mathematical” consequences. For instance:

Corollary 1. For infinitely many integers $n$, there exists a nilpotent saturated
linear function $f : \mathbb{R}^n \to \mathbb{R}^n$ such that $f^{2^n} \not\equiv 0$.

Of course, in this corollary, $2^n$ can be replaced by any recursive function of $n$.
In contrast, if $f : \mathbb{R}^n \to \mathbb{R}^n$ is a nilpotent linear function, then $f^n \equiv 0$. As a
side remark, we note that it can be shown that this is not only true for linear
functions, but also for polynomials and even more generally for real analytic
functions.

We conclude this section with two positive results: globally asymptotically 
stable saturated linear systems are recursively enumerable and so are saturated 
linear systems that have a nonzero periodic trajectory. The first observation is 
due to Eduardo Sontag, the second is due to Alexander Megretski. 

Theorem 2. The set of saturated linear systems that are globally asymptotically 
stable is recursively enumerable. 



Theorem 3. The set of saturated linear systems that have a nonzero periodic 
trajectory is recursively enumerable. 

The proofs of these results are based on elementary arguments; they can be
found in the full version of the paper. Combining these two observations with
Theorem 1, we deduce that there exist saturated linear systems that are not
globally asymptotically stable and have no nonzero periodic trajectories.

Corollary 2. There exist saturated linear systems that are not globally asymp- 
totically stable and have no nonzero periodic trajectory. 






3 Turing Machines 

A Turing machine $M$ is an abstract deterministic computer with a finite
set $Q$ of internal states. It operates on a doubly-infinite tape over some finite
alphabet $\Sigma$. The tape consists of squares indexed by an integer $i$, $-\infty < i < \infty$.
At any time, the Turing machine scans the square indexed by 0. Depending
upon its internal state and the scanned symbol, it can perform one or more of
the following operations: replace the scanned symbol with a new symbol, focus
attention on an adjacent square (by shifting the tape by one unit), and transfer
to a new state.

The instructions for the Turing machine are quintuples of the form 



$[q_i, s_j, s_k, D, q_l]$

where $q_i$ and $s_j$ represent the present state and scanned symbol, respectively,
$s_k$ is the symbol to be printed in place of $s_j$, $D$ is the direction of motion
(left-shift, right-shift, or no-shift of the tape), and $q_l$ is the new internal state. For
consistency, no two quintuples can have the same first two entries. If the Turing
machine enters a state-symbol pair for which there is no corresponding quintuple,
it is said to halt.

Without loss of generality, we can and will assume that $\Sigma = \{0, 1, \ldots, n-1\}$,
$Q = \{0, 1, \ldots, m-1\}$, $n, m \in \mathbb{N}$, and that the Turing machine halts if and only
if the internal state $q$ is equal to zero. We refer to $q = 0$ as the accepting state.

The tape contents can be described by two infinite words $w_1, w_2 \in \Sigma^\omega$,
where $\Sigma^\omega$ stands for the set of infinite words over the alphabet $\Sigma$: $w_1$ consists of
the scanned symbol and the symbols to its right; $w_2$ consists of the symbols to
the left of the scanned symbol, excluding the latter. The tape contents $(w_1, w_2)$,
together with an internal state $q \in Q$, constitute a configuration of the Turing
machine. If a quintuple applies to a configuration (that is, if $q \neq 0$), the result
is another configuration, a successor of the original. Otherwise, if no quintuple
applies (that is, if $q = 0$), we have a terminal configuration. We thus obtain a
successor function $h : C \to C$, where $C = \Sigma^\omega \times \Sigma^\omega \times Q$ is the set of all
configurations (the configuration space). Note that $h$ is a partial function, as it is
undefined when $q = 0$. A configuration is said to be mortal if repeated application
of the function $h$ eventually leads to a terminal configuration. Otherwise,
the configuration is called immortal. We shall say that a Turing machine $M$ is
mortal if all configurations are mortal, and that it is nilpotent if there exists an
integer $k$ such that $M$ halts in at most $k$ steps starting from any configuration.
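
The successor function $h$ can be sketched in Python as follows. This is only an illustration under stated assumptions: the dictionary encoding of quintuples, the head-move letters "L", "R", "N", the padding of finite lists with the symbol 0, and the toy machine are all hypothetical choices, not the paper's construction.

```python
def step(rules, config):
    """Apply the successor function once; return None on a terminal configuration.

    config = (left, right, q): right[0] is the scanned symbol and left[0] is the
    symbol immediately to its left. Both lists are finite prefixes of infinite
    words and are padded with the symbol 0 on demand (an assumption of this sketch).
    rules maps (state, scanned symbol) to (written symbol, move, new state).
    """
    left, right, q = config
    scanned = right[0] if right else 0
    if (q, scanned) not in rules:
        return None  # no quintuple applies: terminal configuration
    written, move, q_next = rules[(q, scanned)]
    right = [written] + list(right[1:])
    left = list(left)
    if move == "R":      # attention moves to the right neighbour
        left = [right[0]] + left
        right = right[1:]
    elif move == "L":    # attention moves to the left neighbour
        right = [left[0] if left else 0] + right
        left = left[1:]
    return (left, right, q_next)

def halting_time(rules, config, max_steps):
    """Number of steps before a terminal configuration, or None if the bound is exceeded."""
    for t in range(max_steps + 1):
        nxt = step(rules, config)
        if nxt is None:
            return t
        config = nxt
    return None

# Hypothetical machine: in state 1, erase 1s while moving right; stop (state 0) at the first 0.
rules = {(1, 1): (0, "R", 1), (1, 0): (0, "N", 0)}
steps_needed = halting_time(rules, ([], [1, 1, 1], 1), 10)
```

Note that state 0 has no quintuple, matching the convention that the machine halts exactly when $q = 0$. This toy machine is not mortal: a configuration whose right word consists only of 1s never halts, although such a configuration cannot be written with the finite lists used here.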

Theorem 4. A Turing machine is mortal if and only if it is nilpotent. 

Proof. A nilpotent Turing machine is mortal, by definition. The converse will
follow from Lemma 1. In order to apply that lemma, we endow the configuration
space of a Turing machine with a topology which makes its successor function
$h$ continuous, and its configuration space $(C, d)$ compact. This is a fairly
standard construction and we refer the reader to the full version of the paper for a
complete description. The constructed function $h$ is identically equal to 0 in a
neighborhood of 0. We therefore conclude from Lemma 1 that if $M$ is mortal,
then it must be nilpotent. □

The next result is due to Hooper and will play a central role in the sequel. 

Theorem 5 ([13]). The problem of determining whether a given Turing
machine is mortal is undecidable.

In other words, one cannot decide whether a given Turing machine halts for 
every initial configuration. Equivalently, one cannot decide whether there exists 
an immortal configuration. 



4 Turing Machine Simulation 

A $\sigma^*$-function is a function obtained by composing finitely many $\sigma$-functions.
It is well known that Turing machines can be simulated by piecewise affine
dynamical systems [15, 16, 18]. Moreover, this simulation can be performed with
a $\sigma^*$-function (see the full version of the paper for the details of the construction
of this function).

Lemma 2 ([15, 16, 18]). Let $M$ be a Turing machine and let $C = \Sigma^\omega \times \Sigma^\omega \times Q$
be its configuration space. There exists a $\sigma^*$-function $g_M : \mathbb{R}^2 \to \mathbb{R}^2$ and an
encoding function $\nu : C \to [0, 1]^2$ such that the following diagram commutes:

\[
\begin{array}{ccc}
C & \xrightarrow{\ \vdash\ } & C \\
\downarrow{\scriptstyle \nu} & & \downarrow{\scriptstyle \nu} \\
\mathbb{R}^2 & \xrightarrow{\ g_M\ } & \mathbb{R}^2
\end{array}
\]

(i.e. $g_M(\nu(c)) = \nu(c')$ for all configurations $c, c' \in C$ with $c \vdash c'$).

We extend this result by proving that any Turing machine can be simulated
by a dynamical system in a stronger sense.

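Lemma 3. Let $M$ be a Turing machine and let $C = \Sigma^\omega \times \Sigma^\omega \times Q$ be its
configuration space. Then, there exists a $\sigma^*$-function $g_M : \mathbb{R}^2 \to \mathbb{R}^2$, a decoding
function $\nu' : [0, 1]^2 \to C$, and some subsets $\mathcal{N}^\infty \subseteq \mathcal{N}^1 \subseteq [0, 1]^2$, $\mathcal{N}^1_{\neg acc} \subseteq \mathcal{N}^1$
such that the following conditions hold:

1. $g_M(\mathcal{N}^\infty) \subseteq \mathcal{N}^\infty$ and $\nu'(\mathcal{N}^\infty) = C$.

2. $\mathcal{N}^1_{\neg acc}$ (respectively $\mathcal{N}^1$) is the Cartesian product of two finite unions of closed
intervals in $\mathbb{R}$. $\mathcal{N}^1_{\neg acc}$ is at a positive distance from the origin $(0, 0)$ of $\mathbb{R}^2$.

3. For $x \in \mathcal{N}^1$, the configuration $\nu'(x)$ is nonterminal if and only if $x \in \mathcal{N}^1_{\neg acc}$.

4. The following diagram commutes:

\[
\begin{array}{ccc}
C & \xrightarrow{\ \vdash\ } & C \\
\uparrow{\scriptstyle \nu'} & & \uparrow{\scriptstyle \nu'} \\
\mathcal{N}^1 & \xrightarrow{\ g_M\ } & [0, 1]^2
\end{array}
\]

(i.e. $\nu'(x) \vdash \nu'(g_M(x))$ for all $x \in \mathcal{N}^1$).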
Intuitively, $\nu'$ is an inverse of the encoding function $\nu$ of Lemma 2 in the
sense that $\nu'(\nu(c)) = c$ holds for all configurations $c$. The set $\mathcal{N}^\infty$ is the image
of the function $\nu$, consisting of those points $x \in [0, 1]^2$ that are unambiguously
associated with valid configurations of the Turing machine. The set $\mathcal{N}^1$ consists
of those points that encode an internal state $q$, a scanned symbol $\alpha$, and a
symbol $\beta$ to the left of the scanned one. (However,
not all points in $\mathcal{N}^1$ are images of valid configurations. Once it encounters a
“decoding failure” our decoding function $\nu'$ sets the corresponding tape square,
and all subsequent ones, to the zero symbol.) Finally, $\mathcal{N}^1_{\neg acc}$ is the subset of
$\mathcal{N}^1$ associated with the nonterminal internal states $q \neq 0$. See the full paper for
complete details.

Using Lemma 3 and Theorem 5 we can now prove:

Theorem 6. The problems of determining whether a given (possibly discontinuous)
piecewise affine function in dimension 2 is (i) globally convergent, (ii)
globally asymptotically stable, (iii) mortal, or (iv) nilpotent, are all undecidable.

The undecidability of the first three properties was first established in 0 . That 
proof was based on an undecidability result for the mortality of counter machines, 
instead of Turing machines. 

Proof. We use a reduction from the problem of Theorem 5. Suppose that a
Turing machine $M$ is given. Denote by $g'_M$ the discontinuous function which is
equal to the function $g_M$ of Lemma 3 on $\mathcal{N}^1_{\neg acc}$ and which is equal to 0 outside
of $\mathcal{N}^1_{\neg acc}$.

Since 0 is at a positive distance from $\mathcal{N}^1_{\neg acc}$, we have a neighborhood $O$ of
0 such that $g'_M(O) = \{0\}$. By Lemma 1, all four properties in the statement of
the theorem are equivalent.

Assume first that $M$ is mortal. By Theorem 4 there exists $k$ such that $M$
halts on any configuration in at most $k$ steps. We claim that $(g'_M)^{k+1}(\mathbb{R}^2) =
\{0\}$. Indeed, assume, in order to derive a contradiction, that there exists a
trajectory $x_{t+1} = g'_M(x_t)$ with $x_{k+1} \neq 0$. Since $g'_M$ is zero outside $\mathcal{N}^1_{\neg acc}$, we have
$x_t \in \mathcal{N}^1_{\neg acc}$ for $t = 0, \ldots, k$. By the commutative diagram of Lemma 3 the
sequence $c_t = \nu'(x_t)$ ($t = 0, \ldots, k+1$) is a sequence of successive configurations
of $M$. This contradicts the hypothesis that $M$ reaches a terminal configuration
after at most $k$ steps. It follows that $g'_M$ satisfies properties (i) through (iv).

Conversely, suppose that $M$ has an immortal configuration: there exists an
infinite sequence $c_t$ of non-terminal configurations with $c_t \vdash c_{t+1}$ for all $t \in \mathbb{N}$.
By condition 1 of Lemma 3 there exists $x_0 \in \mathcal{N}^\infty$ with $\nu'(x_0) = c_0$. We claim
that the trajectory $x_{t+1} = g'_M(x_t)$ is immortal: using condition 2 of Lemma 3
it suffices to prove that $x_t \in \mathcal{N}^1_{\neg acc}$ for all $t$. Indeed, we prove by induction on
$t$ that $x_t \in \mathcal{N}^\infty \cap \mathcal{N}^1_{\neg acc}$ and $\nu'(x_t) = c_t$ for all $t$. Using condition 3 of Lemma 3
the induction hypothesis is true for $t = 0$. Assuming the induction hypothesis
for $t$, condition 1 of Lemma 3 shows that $x_{t+1} \in \mathcal{N}^\infty$. Now, the commutative
diagram of Lemma 3 shows that $\nu'(x_{t+1}) = c_{t+1}$, and condition 3 of Lemma 3
shows that $x_{t+1} \in \mathcal{N}^1_{\neg acc}$. This completes the induction. Hence, $g'_M$ is not mortal,
and therefore does not satisfy any of the properties (i) through (iv). □

5 The Hyperplane Problem

We now reach the second step of our proof. Using the undecidability result of 
Hooper for the mortality of Turing machines, we prove that it cannot be decided 
whether a given piecewise affine system has a trajectory that stays forever in a 
given hyperplane. 

Theorem 7. The problem of determining if a given $\sigma^*$-function $f : \mathbb{R}^3 \to \mathbb{R}^3$
has a trajectory $x_{t+1} = f(x_t)$ that belongs to $\{0\} \times \mathbb{R}^2$ for all $t$ is undecidable.

Proof. We reduce the problem of Theorem 5 to this problem.



Suppose that a Turing machine $M$ is given. Consider the $\sigma^*$-function $f :
\mathbb{R}^3 \to \mathbb{R}^3$ defined by

\[
f(x^1, x^2, x^3) = \bigl(\sigma(\sigma(Z(x^2, x^3))),\ g_M(x^2, x^3)\bigr),
\]

where $g_M$ is the function constructed in Lemma 3 and $Z$ is a $\sigma^*$-function
that is equal to zero for $x \in \mathcal{N}^1_{\neg acc}$ and is otherwise positive (an explicit
construction of this function is provided in the full version of the paper). Note that
in the definition of the function $f$ we use a nested application of the function $\sigma$.
This is to ensure that the definition of $f$ involves an equal number of applications
of the $\sigma$ function on all its components.

Write $(x^1, x^2, x^3)$ for the components of a point $x$ of $\mathbb{R}^3$.

We prove that $f$ has a trajectory $x_{t+1} = f(x_t)$ with $x^1_t = 0$ for all $t$, if and
only if the Turing machine $M$ has an immortal configuration.

Suppose that $f$ has such a trajectory. Since the first component of $f(x)$
is strictly positive whenever $(x^2, x^3)$ lies outside of $\mathcal{N}^1_{\neg acc}$, we must have
$(x^2_t, x^3_t) \in \mathcal{N}^1_{\neg acc}$ for all $t \geq 0$.
By the commutative diagram of Lemma 3, the sequence $\nu'(x^2_t, x^3_t)$, $t \in \mathbb{N}$, is a
sequence of successive configurations of $M$. By condition 3 of Lemma 3, none of
these configurations is terminal, i.e. $c_0 = \nu'(x^2_0, x^3_0)$ is an immortal configuration
of $M$.

Conversely, assume that $M$ has an immortal configuration, that is, there
exists an infinite sequence of nonterminal configurations with $c_t \vdash c_{t+1}$. The
argument here is the same as in the proof of Theorem 6. By condition 1 of
Lemma 3, there exists a point $(x^2_0, x^3_0) \in \mathcal{N}^\infty$ with $\nu'(x^2_0, x^3_0) = c_0$. Consider the
sequence defined by $(x^2_{t+1}, x^3_{t+1}) = g_M(x^2_t, x^3_t)$ for all $t$. Since $g_M(\mathcal{N}^\infty) \subseteq \mathcal{N}^\infty$,
we have $(x^2_t, x^3_t) \in \mathcal{N}^\infty$ for all $t \geq 0$. Using the assumption that configuration
$c_t$ is nonterminal and condition 3 of Lemma 3, we deduce that $(x^2_t, x^3_t) \in \mathcal{N}^1_{\neg acc}$
for all $t \geq 0$, which means precisely that the sequence $x_t = (0, x^2_t, x^3_t)$, $t \in \mathbb{N}$, is
a trajectory of $f$. □

6 Proof of the Main Theorem 

We now reach the last step in the proof, which consists of reducing the problem
of Theorem 7 to the problems of Theorem 1.

Recall that a $\sigma$-function is a function of the form $f(x) = \sigma(Ax + b)$ and a
$\sigma_0$-function is a function of the form $f(x) = \sigma(Ax)$. A composition of finitely
many $\sigma_0$-functions is called a $\sigma_0^*$-function.

Lemma 4. The problems of determining whether a given $\sigma_0^*$-function $f : \mathbb{R}^4 \to \mathbb{R}^4$
is (i) globally convergent, (ii) globally asymptotically stable, (iii) mortal, or (iv)
nilpotent, are all undecidable.

Proof. The problem of Theorem 7 can be reduced to the mortality problem
for $\sigma_0^*$-functions. The construction is such that the $\sigma_0^*$-function is equal to zero
in a neighborhood of the origin (see the full paper for the construction of the
function). It therefore follows from Lemma 1 that for this function, the properties
(i)-(iv) are equivalent. These four properties are therefore undecidable. □

We can now prove Theorem 1.

Proof (of Theorem 1). We reduce the problems in Lemma 4 to the problems in
Theorem 1.

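Let $f : \mathbb{R}^4 \to \mathbb{R}^4$ be a $\sigma_0^*$-function of the form $f = f_k \circ f_{k-1} \circ \cdots \circ f_1$ for some
$\sigma_0$-functions $f_j(x) = \sigma(A_j x)$, where $f_j : \mathbb{R}^{d_{j-1}} \to \mathbb{R}^{d_j}$ with $d_0, d_1, \ldots, d_k \in \mathbb{N}$,
and $d_0 = d_k = 4$.

Let $d = d_0 + d_1 + \cdots + d_{k-1}$, and consider the saturated linear function $f' :
\mathbb{R}^d \to \mathbb{R}^d$ defined by $f'(x) = \sigma(Ax)$ where

\[
A = \begin{pmatrix}
0 & 0 & \cdots & 0 & A_k \\
A_1 & 0 & \cdots & 0 & 0 \\
0 & A_2 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & A_{k-1} & 0
\end{pmatrix}.
\]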

Clearly, the iterates of this function simulate the iterates of the function $f$.

Suppose that $f'$ is mortal (respectively nilpotent, globally convergent, globally
asymptotically stable). Then, the same is true for $f$: indeed, when $x_{t+1} =
f(x_t)$ is a trajectory of $f$, the sequence $(x_t, f_1(x_t), \ldots, f_{k-1} \circ \cdots \circ f_1(x_t))$ is a
subsequence of a trajectory of $f'$.

Conversely, let $x'_{t+1} = f'(x'_t)$ be a trajectory of $f'$. Write $x'_t = (y^1_t, \ldots, y^k_t)$
with each of the $y^j_t$ in $\mathbb{R}^{d_{j-1}}$. For every $t_0 \in \{0, \ldots, k-1\}$ and $j \in \{1, \ldots, k\}$,
the sequence $t \mapsto y^j_{t_0 + kt}$ is a trajectory of $f$ (up to composition with
$f_{j-1} \circ \cdots \circ f_1$ when $j > 1$). This implies that the sequence $y^j_t$,
$t \in \mathbb{N}$, is eventually null (respectively, converges to 0) if $f$ is mortal (respectively,
globally convergent). For the same reason, the global asymptotic stability of $f$
implies that of $f'$; and if $f^m \equiv 0$ for some integer $m$, we have $(f')^{k(m+1)} \equiv 0$. □
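
The block construction can be checked on a small instance. The sketch below assembles the cyclic block matrix from $A_1, \ldots, A_k$ and verifies that $k$ steps of $f'$ reproduce one step of $f$ on a point of the form $(x, 0, \ldots, 0)$. The function names and the toy matrices (with dimensions 1 and 2 rather than the paper's $d_0 = d_k = 4$) are assumptions made for illustration only.

```python
from fractions import Fraction

def sigma(v):
    """Componentwise saturation to [-1, 1]."""
    return [max(Fraction(-1), min(Fraction(1), x)) for x in v]

def matvec(A, x):
    return [sum(Fraction(a) * xi for a, xi in zip(row, x)) for row in A]

def block_cyclic(mats):
    """Assemble A from [A_1, ..., A_k]: A_k in the top-right block,
    A_1, ..., A_{k-1} just below the block diagonal."""
    k = len(mats)
    in_dims = [len(M[0]) for M in mats]            # d_0, ..., d_{k-1}
    offsets = [sum(in_dims[:c]) for c in range(k)]
    d = sum(in_dims)
    A = [[Fraction(0)] * d for _ in range(d)]
    for r in range(k):
        M = mats[r - 1]        # block row 0 uses A_k, block row r uses A_r
        c = (r - 1) % k        # block column read by block row r
        for i, row in enumerate(M):
            for j, a in enumerate(row):
                A[offsets[r] + i][offsets[c] + j] = Fraction(a)
    return A

def f_composed(mats, x):
    """f = f_k o ... o f_1 with f_j(x) = sigma(A_j x)."""
    for M in mats:
        x = sigma(matvec(M, x))
    return x

# Toy instance (hypothetical): f_1 : R^1 -> R^2 and f_2 : R^2 -> R^1.
mats = [[[2], [-1]], [[1, -1]]]
A = block_cyclic(mats)
x = [Fraction(1, 4)]
z = x + [Fraction(0), Fraction(0)]
for _ in range(len(mats)):     # k steps of f'(z) = sigma(A z)
    z = sigma(matvec(A, z))
```

After the loop, the first block of `z` equals `f_composed(mats, x)` and the remaining block is zero, which is the sense in which the iterates of $f'$ simulate those of $f$.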



7 Continuous Piecewise Affine Systems

We proved in Theorem 6 that it cannot be decided whether a given discontinuous
piecewise affine system of dimension 2 is globally convergent, globally
asymptotically stable, mortal, or nilpotent. We do not know whether these problems
remain undecidable when the systems are of dimension 1.
For continuous systems, we can prove the following.

Theorem 8. For continuous piecewise affine systems in dimension 3, the four
properties of global convergence, global asymptotic stability, mortality, and
nilpotence are undecidable.

Proof. The system built in the proof of Lemma 4 is of dimension 4. The
construction can be adapted to a system of dimension 3. See the full paper. □

The following proposition is proved in . 

Theorem 9. For continuous piecewise affine systems in dimension 1, the prop- 
erties of global convergence, global asymptotic stability, and mortality are decid- 
able. 

One can also show that nilpotence is decidable for this class of systems. Thus, 
all properties are decidable for continuous piecewise affine systems in dimension 
1, and are undecidable in dimension 3. The situation in dimension 2 has not 
been settled. 

Global properties of $f : \mathbb{R}^n \to \mathbb{R}^n$    n = 1        n = 2          n = 3

Piecewise affine                     ?            Undecidable    Undecidable

Continuous piecewise affine          Decidable    ?              Undecidable



8 Final Remarks 

In addition to the two question marks in the table of the previous section, several 
questions which have arisen in the course of this work still await an answer: 

1. Does there exist some fixed dimension n such that nilpotence (or mortality, 
global asymptotic stability and global convergence) of saturated linear sys- 
tems of dimension n is undecidable? A negative answer would be somewhat 
surprising since there would be in that case a decision algorithm for each n, 
but no single decision algorithm working for all n. 

2. It would be interesting to study the decidability of these four properties for 
other special classes of saturated linear systems, as we have already done 
for nilpotent and symmetric matrices. For instance, is global convergence or 
global asymptotic stability decidable for systems with invertible matrices? 
(Note that such a system cannot be nilpotent or mortal.) Are some of the 
global properties decidable for matrices with entries in { — 1, 0, 1}? 

3. For saturated linear systems, is mortality equivalent to nilpotence? Is global 
convergence equivalent to global asymptotic stability? (This last equivalence 
is conjectured in Section 2.) We show in the full version of the paper that 
these equivalences hold for systems with symmetric matrices. 

4. For a polynomial map $f : \mathbb{R}^n \to \mathbb{R}^n$, mortality is equivalent to nilpotence;
these properties are equivalent to the condition $f^n \equiv 0$, and hence decidable.
It is however not clear whether the properties of global asymptotic stability
and global convergence are equivalent, or decidable.

5. Does there exist a dimension $n$ such that for any integer $k$ there exists a
nilpotent saturated linear system $f : \mathbb{R}^n \to \mathbb{R}^n$ such that $f^k \not\equiv 0$? Note that
this question (and some of the other questions) still makes sense if we allow
matrices with arbitrary real (instead of rational) entries.

Abstract. We establish a first step towards a “Rice theorem” for tilings:
for non-trivial sets, it is undecidable to know whether two different tile
sets produce the same tilings of the plane. Then, we study quasiperiodicity
functions associated with tilings. This function is a way to measure
the regularity of tilings. We prove that not only can almost all recursive
functions be obtained as quasiperiodicity functions, but so can a function
which outgrows any recursive function.



1 Introduction 

Tilings have been studied for a very long time from different points of view.
In 1961, Hao Wang introduced the formalism of colored square tiles (now called
Wang tiles). He was motivated by the decidability of the
satisfiability problem for a class of formulas defined by the structure of their prenex
normal form: the Kahr class $K = [\forall\exists\forall, (0, \omega)]$. A tile set $\tau$ can be
recursively transformed into a formula of the Kahr class such that

- the plane can be tiled by $\tau$ if and only if the formula has a model,

- the plane can be tiled by $\tau$ periodically if and only if the formula has a
finite model.

Whether the plane can be tiled by $\tau$ was then proved undecidable by Berger
in 1966 (the domino problem [2]) and the periodic case was proved undecidable by
Gurevich and Koriakov in 1972. These proofs are based on a complicated
construction due to Berger and clarified later. We also use this
construction in the proofs of our theorems. As far as we know, nobody could prove
the undecidability of the Kahr class without using tilings, and the undecidability of
the 8 other classes (necessary to close Hilbert’s Entscheidungsproblem) is proved
by reduction to the Kahr class. This justifies the importance of studying decision
problems over tilings and also their periodicity aspects.

Thus, tilings became basic objects and were often used as tools for proving
undecidability results for planar problems, and were also broadly
used in complexity theory. Later, Wang tiles were used as
models for physical constraints, mainly for studying quasicrystals.
Quasicrystals are related to a property called quasiperiodicity, which is
an extension of periodicity (called uniform recurrence in 1-dimensional language
theory).

In this paper, our first goal is to prove a Rice theorem for tilings. In
recursivity theory, this theorem says that given any property of functions, the set of
programs that compute a function with this property is either trivial (i.e. empty or
full) or non recursive. Intuitively, it seems that problems concerning tilings are also either
trivial or undecidable. But the formalisation of this intuitive assertion is not
at all trivial and we do not know any formulation that could be a good candidate
to capture this idea. Thus, we restrict ourselves to a weaker version,
analogous in recursivity theory to the following proposition: given a program P,
the set of programs that compute the same function as P is non recursive. In
our tiling framework we prove that given a non-trivial tile set $\tau$, the set of tile sets
that produce the same tilings of the plane as $\tau$ is non recursive (Th. 1). The
same construction allows us to prove an analogous theorem concerning limit sets
of cellular automata, thus improving Kari’s result.

In the sequel, we focus on the notion of quasiperiodicity (which extends
periodicity). An analogue of Furstenberg’s lemma has been proved for
tilings: if a tile set can tile the plane, then it can also be used to form a
quasiperiodic tiling of the plane. This means that even with a complicated set of tiles,
one cannot force tilings to be very “irregular”, “chaotic”, or “complex” in an
intuitive sense, because some tilings will be quasiperiodic. Using a (strange)
terminology used in physics: local constraints cannot force chaos. This result is
rather surprising because using an adaptation of Berger’s construction one can
build some tile sets that can tile the plane, but such that none of the obtained
tilings are recursive. The regularity of quasiperiodic tilings can be measured by
their quasiperiodicity function. We prove that given any “natural” function $f$,
one can construct a tile set such that the quasiperiodicity function of all tilings
obtained is $f$, where “natural” is formalized by a notion close to time-constructibility.
Our last result states that there exists a tile set whose quasiperiodicity
function grows to infinity faster than any recursive function. The intuitive
meaning of this theorem is that although this tile set produces “regular” (i.e.
quasiperiodic) tilings, the regularity of these tilings cannot be observed by
computer.

2 Definitions 

The classical way to consider tilings of the plane is to use Wang tiles.
A Wang tile is a unit square tile whose edges are colored. A tile set is a finite
set of Wang tiles. A configuration consists of tiles which are placed on a
two-dimensional infinite grid. A tiling is a configuration in which two juxtaposed
tiles have the same color on their common border.

A pattern is a restriction of a configuration to a finite domain of $\mathbb{Z}^2$. A square
pattern of size $a$ is a pattern whose domain is $\llbracket 1, a \rrbracket \times \llbracket 1, a \rrbracket$.

In order to prove a Rice theorem for tilings, we need to be able to compare 
tilings. If we use Wang’s definition introduced above, we can’t compare the tilings 
produced by two different tile sets, because they may use different colors. 

Thus, we chose to use a more general model of tilings: local constraints.
Let us focus on binary maps, i.e. configurations of 0 and 1. We
“constrain” binary maps by imposing that all patterns of a fixed domain $\mathcal{D}$
extracted from the produced binary maps belong to a certain constraint set
(which will be represented by a function from the set of patterns of domain $\mathcal{D}$
to $\{0, 1\}$). This is what we will call a local constraint.

Definition 1. We call local constraint a pair $c = (V, f)$, where $V$ is a vector
of $n$ elements of $\mathbb{Z}^2$ and $f$ a function from $\{0,1\}^n$ to $\{0,1\}$. $V$ is called the
neighborhood and $f$ is called the constraint function.

A binary map is a function from $\mathbb{Z}^2$ to $\{0,1\}$. A binary map $p$ verifies a local
constraint $c = ((v_1, \ldots, v_n), f)$, or is produced by $c$, if and only if

$\forall x \in \mathbb{Z}^2, \; f(p(x + v_1), p(x + v_2), \ldots, p(x + v_n)) = 0.$

A local constraint is trivial if and only if it is verified by all binary maps, and
non-trivial otherwise.

We will denote by $\mathcal{P}(c)$ the set of all binary maps verifying $c$.



Definition 2. A binary pattern $M$ is a pair $(\mathcal{D}, f)$ where $\mathcal{D}$ is a finite sequence
of elements of $\mathbb{Z}^2$ and $f$ a function from $\mathcal{D}$ into $\{0, 1\}$. $\mathcal{D}$ is called the domain of $M$. For
all $x$ of $\mathcal{D}$, we write $M(x)$ for $f(x)$.

A binary pattern $M$ of domain $\mathcal{D}$ is extracted from a binary map $m$ if and
only if there exists a position $(i, j)$ such that:

$\forall z \in \mathcal{D}, \; M(z) = m(z + (i, j))$

Now, we define some key properties of binary patterns. 

Definition 3. We say that a binary pattern $M$ of domain $\mathcal{D}$ cancels out a
constraint function $f$ of a local constraint whose neighborhood is $(v_i)_{1 \leq i \leq k}$,
if and only if for all $x$ in $\mathcal{D}$ such that all the $x + v_i$ are in $\mathcal{D}$,
$f(M(x + v_1), M(x + v_2), \ldots, M(x + v_k)) = 0$.

A pattern verifies a local constraint $c$ if and only if it cancels out the constraint
function of $c$. Otherwise, it is called a forbidden pattern for $c$.



Remark 1. A binary map verifies a local constraint if and only if all its patterns 
verify this local constraint. 
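
Definition 3 translates directly into a short check on finite patterns. The following Python sketch is illustrative only: the dictionary representation of patterns, the function name `verifies`, and the example constraint (no two horizontally adjacent 1s) are assumptions, not taken from the paper.

```python
def verifies(pattern, neighborhood, f):
    """Check whether a finite binary pattern cancels out the constraint function f.

    pattern: dict mapping positions (i, j) to 0 or 1 (the domain is its key set).
    neighborhood: list of offset vectors v_1, ..., v_n.
    f: constraint function; f(...) == 0 means the local window is accepted.
    Only positions x whose whole window x + v_i lies in the domain are tested.
    """
    for (x, y) in pattern:
        cells = [(x + dx, y + dy) for (dx, dy) in neighborhood]
        if all(c in pattern for c in cells):
            if f(*(pattern[c] for c in cells)) != 0:
                return False
    return True

# Hypothetical constraint: forbid two horizontally adjacent 1s.
neighborhood = [(0, 0), (1, 0)]
def no_adjacent_ones(a, b):
    return 1 if (a == 1 and b == 1) else 0

checkerboard = {(i, j): (i + j) % 2 for i in range(3) for j in range(3)}
all_ones = {(i, j): 1 for i in range(3) for j in range(3)}
checker_ok = verifies(checkerboard, neighborhood, no_adjacent_ones)
ones_ok = verifies(all_ones, neighborhood, no_adjacent_ones)
```

Here the checkerboard pattern verifies the constraint, while the all-ones pattern is a forbidden pattern for it.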

In some way, the approach of planar tilings with local constraint is equiv- 
alent to the tiling of the plane using Wang tiles: we can associate to a Wang 
tile set a local constraint such that each Wang tile corresponds to a certain pat- 
tern of 0 and 1, and such that the plane is tilable [resp. periodically tilable] by 
the Wang tile set if and only if there is a binary map which verifies the local 
constraint. Conversely, we can associate to each local constraint a Wang tile set 
such that each Wang tile corresponds to an accepted pattern, and verifying the 
same equivalence of tilability. 
3 Towards a Rice Theorem for Tilings

Thanks to the notion of local constraints, we are able to formulate a theorem
which is not exactly a Rice theorem for tilings but is a first step in the direction
of this goal. Given two local constraints $a$ and $b$, we are now able to compare the
two sets $\mathcal{P}(a)$ and $\mathcal{P}(b)$ of those tilings they produce since they both are sets of
binary maps.

Let $c$ be a local constraint. We call SM($c$) the following problem:

Problem: SM($c$)

Instance: A local constraint $\nu$

Question: Does the local constraint $\nu$ produce exactly the same binary maps
as $c$ (i.e. $\mathcal{P}(\nu) = \mathcal{P}(c)$)?



Theorem 1. Let $c$ be a non-trivial local constraint. Then, SM($c$) is undecidable
(more precisely $\Sigma_1$-complete).

Proof. The idea of the proof is that this problem is at least as difficult to solve as
to decide whether there exists a binary map which verifies a given local constraint
(domino problem). We choose a forbidden pattern $F$ for $c$. Then, from a local
constraint $q$, we build a new local constraint which produces strictly more maps
than $c$ if and only if $q$ produces a binary map. In fact, we use $F$ as a delimiter
to code binary maps in the “language” of $c$.

The following decision problem is called the “domino problem” and is expressed
here for local constraints. It was proved undecidable by Berger in [2]:

Problem: Domino

Instance: A local constraint $\nu$

Question: Does there exist a binary map that verifies $\nu$ (i.e. $\mathcal{P}(\nu) \neq \emptyset$)?

This problem has been proved in [2] to be undecidable.

Without loss of generality, we can consider only local constraints whose
neighborhood is $\llbracket 1, a \rrbracket \times \llbracket 1, a \rrbracket$. Let $c = (\llbracket 1, a \rrbracket \times \llbracket 1, a \rrbracket, \delta)$ be a non-trivial
local constraint. As $c$ is non-trivial, there exists a forbidden binary pattern $F =
(\llbracket 1, a \rrbracket \times \llbracket 1, a \rrbracket, f)$. Thus, any binary map which contains the pattern $F$ does not
verify $c$.

Let us prove that the set of all local constraints which are verified by the same
binary maps as $c$ is $\Sigma_1$-complete for many-one reductions. To prove that it is
recursively enumerable (which is not completely straightforward), we show that
the set of all pairs of local constraints which produce exactly the same binary
maps is recursively enumerable.

Let $f$ be a program which, given a pair of local constraints $(\mu, \nu)$, executes
the following steps, where $\llbracket 1, m \rrbracket \times \llbracket 1, m \rrbracket$ is the domain of $\mu$ and
$\llbracket 1, n \rrbracket \times \llbracket 1, n \rrbracket$ the domain of $\nu$:

- If $\mu$ is trivial then halt.

- For each forbidden binary pattern $F$ of domain $\llbracket 1, m \rrbracket \times \llbracket 1, m \rrbracket$ of $\mu$ do

  • $i = 0$

  • Repeat (loop a)

    * Check if all patterns of domain $\llbracket 1, m + 2i \rrbracket \times \llbracket 1, m + 2i \rrbracket$ whose
      centered subpattern is $F$ are forbidden for $\nu$.
    * If yes, exit this loop, else increase $i$

  • End Repeat

- End For
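
The program above is a semi-decision procedure, so any runnable version has to cut the search off somewhere. The Python sketch below is a bounded variant under explicit assumptions: constraints are represented as pairs `(size, accept)` with square neighborhoods indexed from 0, `accept` takes the window in row-major order and returns 0 when it is accepted, and `max_radius` is an artificial bound that the program in the text does not have. All names are hypothetical.

```python
from itertools import product

def windows_ok(grid, size, n, accept):
    """True if every n x n window of the size x size grid is accepted (accept(...) == 0)."""
    for x in range(size - n + 1):
        for y in range(size - n + 1):
            w = tuple(grid[(x + i, y + j)] for i in range(n) for j in range(n))
            if accept(w) != 0:
                return False
    return True

def forbidden_patterns(m, accept):
    """All m x m binary patterns rejected by the constraint."""
    return [w for w in product((0, 1), repeat=m * m) if accept(w) != 0]

def halts_within(mu, nu, max_radius):
    """Bounded run of the program on (mu, nu).

    Returns True if, for every forbidden pattern F of mu, some radius
    i <= max_radius makes every extension of F forbidden for nu.
    Returns False if the bound is reached first (the unbounded program
    might still halt later, or might run forever).
    """
    (m, acc_mu), (n, acc_nu) = mu, nu
    for F in forbidden_patterns(m, acc_mu):   # empty when mu is trivial: halt at once
        for i in range(max_radius + 1):
            size = m + 2 * i
            centre = {(i + a, i + b): F[a * m + b]
                      for a in range(m) for b in range(m)}
            free = [(x, y) for x in range(size) for y in range(size)
                    if (x, y) not in centre]
            all_forbidden = True
            for bits in product((0, 1), repeat=len(free)):
                grid = dict(centre)
                grid.update(zip(free, bits))
                if windows_ok(grid, size, n, acc_nu):
                    all_forbidden = False     # found an extension that nu accepts
                    break
            if all_forbidden:
                break                         # exit the inner loop for this F
        else:
            return False                      # bound reached without exiting the loop
    return True

# Hypothetical constraints with 1 x 1 neighborhoods.
no_ones = (1, lambda w: w[0])   # forbids the symbol 1
trivial = (1, lambda w: 0)      # accepts everything
```

With these toy constraints, `halts_within(no_ones, no_ones, 1)` and `halts_within(trivial, no_ones, 1)` return `True`, while `halts_within(no_ones, trivial, 1)` returns `False` because the trivial constraint accepts maps containing a 1. The exhaustive enumeration is exponential in the window area, so this is only usable for very small sizes.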

Suppose $\mu$ is non-trivial and assume that $f(\mu, \nu)$ halts. Let $P$ be a binary
map that does not verify $\mu$. Then, it contains a forbidden pattern $F$ of domain
$[1,m] \times [1,m]$ at position $(x,y)$. As the program exits the loop a when $F$ is
treated, there exists an integer $i$ such that any pattern of domain
$[1,m+2i] \times [1,m+2i]$ whose center subpattern is $F$ is forbidden for $\nu$. Thus,
the pattern of $P$ of position $(x-i, y-i)$ is forbidden for $\nu$. Hence, $P$ does not
verify $\nu$. All binary maps which don't verify $\mu$ don't verify $\nu$.

Conversely, suppose $f(\mu, \nu)$ does not halt. Then, there exists a pattern $F$
for which the loop a never ends. This means that for all integers $i$, there exists
a pattern of domain $[1,m+2i] \times [1,m+2i]$ whose center subpattern is $F$
verifying $\nu$. Then, by standard diagonal extraction (also called König's lemma,
or countable Tychonoff theorem, see |0| for an explanation of these notions in the
world of tilings) there exists a binary map $R$ which contains $F$ and is produced
by $\nu$. $R$ verifies $\nu$ but not $\mu$.

We just proved that $f(\mu, \nu)$ halts if and only if all binary maps that verify $\nu$
also verify $\mu$. This is also true when $\mu$ is trivial. Let $M$ be the Turing machine
which, on input $(\mu, \nu)$, computes $f(\mu, \nu)$ and then $f(\nu, \mu)$. $M$ halts on
$(\mu, \nu)$ if and only if $\mu$ and $\nu$ produce the same binary maps. The set of pairs
of local constraints which produce the same binary maps is therefore recursively
enumerable.

Let's now prove that the set of all local constraints which are verified by the
same binary maps as $c$ is $\Sigma_1$-hard. Let $g$ be a local constraint whose domain is
$[1,l] \times [1,l]$.

For every binary pattern $M$ of domain $[1,x] \times [1,x]$, we denote by $\bar{M}$ the
pattern of domain $[1,a(2x+1)] \times [1,a(2x+1)]$ illustrated by Fig. 1 and built as
follows. The pattern $\bar{M}$ is a juxtaposition of $2x+1$ by $2x+1$ square patterns
of size $a$. Those patterns are:

- For all $i \in [0,x]$ and all $j \in [0,x]$, the pattern located at position
  $(2i+1, 2j+1)$ is equal to $F$;

- For all $i \in [1,x]$ and all $j \in [1,x]$, the pattern located at position
  $(2i, 2j)$ is filled with $M(i,j)$.

For any binary map $P$, let's define the binary map $\bar{P}$ as follows:
$\forall (m,n) \in \mathbb{Z}^2$, $\forall (i,j) \in [1,a] \times [1,a]$,

$$\bar{P}(2na + i, 2ma + j) = P(n, m);$$

$$\bar{P}(na + i, (2m+1)a + j) = \bar{P}((2n+1)a + i, ma + j) = F(i,j).$$

This definition is the extension to binary maps of the previous definition.
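
As an illustration of the encoding of a finite pattern, here is a minimal Python sketch. It assumes that patterns are square lists of lists of bits, that block positions are counted from 1 as in the list above (so data blocks sit at even block positions), and that every block not at an (even, even) position is a copy of $F$, which is what the definition for maps suggests. The name `encode_pattern` is hypothetical.

```python
def encode_pattern(M, F):
    """Build the encoded pattern: data cells of M become a x a blocks, delimited by copies of F."""
    x, a = len(M), len(F)
    blocks = 2 * x + 1
    side = a * blocks
    out = [[0] * side for _ in range(side)]
    for br in range(blocks):          # 0-based block row
        for bc in range(blocks):      # 0-based block column
            # 0-based odd index corresponds to the 1-based even position (2i, 2j).
            is_data = br % 2 == 1 and bc % 2 == 1
            for r in range(a):
                for c in range(a):
                    if is_data:
                        value = M[br // 2][bc // 2]   # block filled with one bit of M
                    else:
                        value = F[r][c]               # delimiter block: a copy of F
                    out[br * a + r][bc * a + c] = value
    return out
```

With a forbidden pattern of size $a$ and a data pattern of size $x$, the result has side $a(2x+1)$, matching the domain stated above.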

Finally, let's define the local constraint $q$ whose domain is
$[1,2l+1] \times [1,2l+1]$ and whose constraint function is $f$. The value of $f$ is 0
only on the following binary patterns:

- all binary patterns which verify $c$ (rule 1);

- all sub-patterns of domain $[1,2l+1] \times [1,2l+1]$ of all $\bar{N}$ such that $N$
  has $[1,l+1] \times [1,l+1]$ for domain and verifies $g$ (rule 2).

Fig. 1. A sample pattern $L$, and its associated $\bar{L}$



Any binary map verifying $c$ verifies $q$ because of rule 1. Moreover, if a binary
map $m$ verifies $g$, $\bar{m}$ verifies $q$ because of rule 2. Hence if there is a
binary map verifying $g$, $q$ is verified by strictly more maps than $c$.

Conversely, if there is a binary map $m$ that verifies $q$ but not $c$, then rule
2 is used at least once, since if not, $m$ would verify $c$. We deduce that there
is a pattern $A = \bar{M}$ in $m$, where $M$ is a pattern of domain $[1,l] \times [1,l]$
which cancels out the constraint function of $g$. Since $F$ doesn't cancel out the
constraint function of $c$, rule 2 is applied to all the patterns that contain
$F$, i.e. to all the patterns whose intersection with $A$ is at least the size of $F$.
Thus only rule 2 is applied. This implies that $m = \bar{Z}$, for a $Z$ which verifies
$g$. Hence, if $q$ produces strictly more maps than $c$, then there is a map which
verifies $g$.

We can conclude that there is a binary map which verifies $g$ if and only if
$q$ is verified by more maps than $c$. As $q$ can be easily constructed from $g$ by a
Turing machine, we have constructed a many-one reduction, and the set of all
local constraints which are verified by exactly the same binary maps as $c$ is
many-one-complete, thus not recursive. □



Thanks to the same kind of proof, we can state an analogous result for cellular
automata (see ^ for a modern book on this topic). A part of the proof
(enumerability) is more complicated and uses other arguments; it will be published
elsewhere.

Definition 4 (limit set). Let $C$ be a planar (i.e. 2-dimensional) cellular
automaton with two states (0 and 1). The limit set $\mathcal{L}(C)$ of $C$ is defined as
follows:

$$\mathcal{L}_0(C) = \{0,1\}^{\mathbb{Z}^2}$$

$$\forall n \geq 1, \quad \mathcal{L}_n(C) = C(\mathcal{L}_{n-1}(C))$$

$$\mathcal{L}(C) = \bigcap_{i \in \mathbb{N}} \mathcal{L}_i(C)$$

Let $C$ be a planar cellular automaton with only two states. We call LS($C$)
the following problem:

Problem: LS($C$)

Instance: A planar cellular automaton $A$ with exactly two states

Question: Are the limit sets of $A$ and $C$ the same (i.e. $\mathcal{L}(A) = \mathcal{L}(C)$)?

The theorem we prove with almost exactly the same proof as Th. 1 is the
following:

Theorem 2. Let $C$ be a planar cellular automaton with two states. Then the
problem LS($C$) is undecidable (more precisely $\Sigma_1$-complete).

This theorem improves a result proved in HSI, where the number of states of 
the considered cellular automata is not bounded. 



4 Quasiperiodicity 

We now present a notion that gives us an idea on the regularity of a tiling. 
Indeed, a tiling can be more or less regular, that is to say its patterns can be 
repeated more or less often. 



4.1 Definition 

We introduce the notion of quasiperiodicity which is a generalisation of the 
notion of periodicity (called uniform recurrence in language theory). A tiling is 
quasiperiodic if and only if for each of its subpatterns, there exists a size of a 
window such that, in any place we put the window on the tiling, we can find a 
copy of this pattern within the window (see [0| ) . 

Definition 5. Let $P$ be a configuration made of Wang tiles. $P$ is quasiperiodic
if and only if for all patterns $M$ extracted from $P$, there exists an integer $n$ such
that $M$ appears in all square patterns of size $n$ extracted from $P$.

The following theorem PI is the analogue of Furstenberg's lemma in language
theory.

Theorem 3. If $\tau$ is a tile set which can tile the plane, then there exists a
quasiperiodic tiling of the plane made with tiles of $\tau$.

We can now define quasiperiodicity functions, which allow us to describe the
regularity of quasiperiodic configurations.

Definition 6. Let $c$ be a quasiperiodic configuration. We call quasiperiodicity
function of $c$ the function which maps positive integers $n$ to the smallest integer
$Q_c(n)$ such that in all square patterns of size $Q_c(n)$ extracted from $c$, we can
observe a sample of all the square patterns of size $n$ that appear in $c$.
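
To make the definition concrete, the following Python sketch computes $Q_c(n)$ by brute force in the simplest possible case. It assumes that the configuration is periodic and given by one square period `tile` (a list of lists of bits repeated in both directions); this restriction and the function names are assumptions made for the illustration, since a general quasiperiodic configuration is an infinite object.

```python
def patterns_in_window(tile, top, left, w, n):
    """Set of n x n patterns seen inside the w x w window whose corner is (top, left)."""
    p = len(tile)
    found = set()
    for r in range(top, top + w - n + 1):
        for c in range(left, left + w - n + 1):
            found.add(tuple(
                tuple(tile[(r + dr) % p][(c + dc) % p] for dc in range(n))
                for dr in range(n)
            ))
    return found


def quasiperiodicity(tile, n):
    """Smallest window size in which every n x n pattern of the configuration appears."""
    p = len(tile)
    # A window of size p + n - 1 contains one pattern per position modulo p,
    # hence every pattern of the periodic configuration.
    every_pattern = patterns_in_window(tile, 0, 0, p + n - 1, n)
    w = n
    while True:
        # By periodicity it is enough to try window corners modulo p.
        if all(patterns_in_window(tile, r, c, w, n) == every_pattern
               for r in range(p) for c in range(p)):
            return w
        w += 1
```

For a checkerboard, windows of size 2 already show both cell values, and windows of size 3 show both $2 \times 2$ patterns.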



4.2 Quasiperiodicity Functions 

We now study what kinds of quasiperiodicity functions can be obtained. We first 
prove that “usual” increasing recursive functions can be obtained. 

Definition 7. A function $f$ is time-constructible if and only if there is a Turing
machine $M$ over the alphabet $\{0, 1\}$ which stops after $f(i)$ steps of computation
on the entry $1^i$.



Theorem 4. Informal version: any increasing time-constructible function
can be observed as a quasiperiodicity function of a tiling.

Formal version: for every time-constructible function $f$, there exists a tile set
$\tau$ such that, for every tiling $p$ produced by $\tau$,

$$Q_p(x) \geq (S_f(x) + 3)^2 + 1$$

where

$$S_f(x) = \sum_{i=0}^{x} (i+1) f(i).$$

Proof. Let $M$ be a Turing machine which witnesses that $f$ is time-constructible.

The idea of the proof is to emulate the space-time diagram of the computation
of $M$ on the entry $i$, and then to include a new pattern of size $i+1$ in the tiling,
beginning with $i = 0$, then $i = 1$, etc.

In order to do that, we emulate the evolution of Turing machines in tilings using
Berger's well-known construction described in [2]. This difficult technique was
invented for proving the undecidability of the domino problem punun.

Let $\{q_0, q_1, \ldots, q_k\}$ be the states of the machine $M$ and $q_f$ its final state.

In order to be sure that a new small pattern won't appear in the space-time
diagram of $M$, which would increase the value of the quasiperiodicity function,
we "stuff" the diagram with #. That is to say, we replace this diagram by another
one, $n$ times bigger, in which each character $a$ is replaced by the $n \times n$ square
represented in Fig. 2. This is what we'll call an $n$-stuffed diagram.
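
The stuffing operation itself is easy to sketch. The snippet below assumes that a diagram is a list of equal-length strings, that the original character sits in the top-left corner of its $n \times n$ block as in Fig. 2, and that $n \geq 1$; the function name `stuff` is hypothetical.

```python
def stuff(diagram, n, filler="#"):
    """Replace every character by an n x n block: the character, then filler everywhere else."""
    stuffed = []
    for row in diagram:
        # First line of the block row: each character followed by n - 1 fillers.
        stuffed.append("".join(ch + filler * (n - 1) for ch in row))
        # Remaining n - 1 lines of the block row contain only fillers.
        stuffed.extend(filler * (n * len(row)) for _ in range(n - 1))
    return stuffed
```

For example, `stuff(["ab"], 2)` yields the two lines `a#b#` and `####`, so a diagram of width $w$ and height $h$ becomes one of width $nw$ and height $nh$.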

a # ... #
# # ... #
   ...
# # ... #

Fig. 2. A stuffed letter

In each area of computation of the construction, we produce the following
diagrams, one above the other, starting with $n = 0$, increasing $n$ by 1 at each
step:

[Diagram: three blocks stacked one above the other. First, the computation of
$M$ on $1^n$, $n$-stuffed, each row beginning with $1^n$ o; then a block of $n$ lines;
then a block of $n$ lines beginning with $1^n$ o followed by # symbols.]

As we list all the letters (0 and 1) and all the states of the Turing machine, we 
are sure that no new patterns of size at most n can appear. 

From step $i$ to $i + 1$, first, the information that a final state has occurred is
sent to the left margin. Then the number of 1s before the o is increased by 1.
This can be done as shown in Fig. 3.




Fig. 3. How to increase the number of 1 



Once we have increased the number of 1s, we add the step $i + 1$. The
$(i+1)$-stuffing is illustrated in Fig. 4, where the big squares represent the
stuffed letters of Fig. 2, except for the left column, which is the margin of the
1s before the o. Nevertheless, the sub-patterns of size $i$ of such squares of size
strictly greater than $i$ are the same, even considering the construction arrow.
This proves that no new pattern of size $i$ appears in step $i + 1$, and also
$i + 2$, etc. The minimum size of window which is sure to contain those $i$ first
steps is the smallest that is big enough to contain all sub-patterns of size $i$.

Fig. 4. How to $i$-stuff the time-space diagram

Thanks to Berger's construction, we know that all windows of size $2^{2n+2} + 1$
contain the $2^{n+1} - 3$ first lines of the construction. Thus, to obtain all patterns
of size $x$, we need the $x$ first steps of the construction. The size of these steps is
exactly $S_f(x) = \sum_{i=0}^{x} (i+1) f(i)$. So we need a window size of
$2^{2n+2} + 1$, where $n$ is the smallest integer such that $2^{n+1} - 3 \geq S_f(x)$. We
deduce that if $q$ is the quasiperiodicity function of this tiling,

$$q(x) = 2^{2n+2} + 1$$

and so that

$$q(x) \geq (S_f(x) + 3)^2 + 1.$$

□
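
The arithmetic at the end of this proof can be checked on a small example. The sketch below assumes the reading $S_f(x) = \sum_{i=0}^{x} (i+1) f(i)$ and a window size of $2^{2n+2} + 1$ for the smallest $n$ with $2^{n+1} - 3 \geq S_f(x)$; the function names and the sample function $f(i) = i + 1$ are chosen only for the illustration.

```python
def s_f(f, x):
    """Total size of the first x + 1 steps: sum of (i + 1) * f(i) for i = 0..x."""
    return sum((i + 1) * f(i) for i in range(x + 1))


def window_size(f, x):
    """Window size 2^(2n+2) + 1 for the smallest n with 2^(n+1) - 3 >= S_f(x)."""
    target = s_f(f, x)
    n = 0
    while 2 ** (n + 1) - 3 < target:
        n += 1
    return 2 ** (2 * n + 2) + 1


def sample_f(i):
    """Illustrative running time only; any time-constructible f could be used."""
    return i + 1
```

Since $2^{n+1} \geq S_f(x) + 3$, squaring gives $2^{2n+2} \geq (S_f(x) + 3)^2$, which is the lower bound stated above; with `sample_f` and $x = 0$ the bound is attained.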

Since we have proved that for almost any increasing recursive function $f$ we
can construct a tile set which produces quasiperiodic tilings whose
quasiperiodicity functions are all nearly $f$, we now construct a tile set which
produces quasiperiodic tilings whose quasiperiodicity function is greater than all
recursive functions.

Theorem 5. There exists a tile set $\tau_0$ such that, for every tiling $p$ produced by
$\tau_0$, $p$ is quasiperiodic, and no recursive function is an upper bound of $Q_p$.

The regularity of any quasiperiodic tiling produced by $\tau_0$ cannot be observed.

Proof. Let $K$ be a non-recursive, recursively enumerable subset of $\{0, 1\}^*$ (e.g.
the set of all pairs $(x, y)$ such that the Turing machine of number $x$ halts on
entry $y$).

The idea of the proof is to build a tile set which emulates a Turing machine 
which enumerates K. If we can compute a size of frame in which all patterns 
of size n appear, then we can obtain an upper bound for the number of steps 
needed by the machine to output all elements of K of size at most n. This would 
lead to a decision algorithm for K, which is impossible. 

Let $M$ be a Turing machine over the alphabet $\{0, 1, \#\}$, with two tapes, the
first one being the working tape, and which, on the empty entry, enumerates
the elements of $K$ and outputs them on the second tape, separated by #, in
increasing order.

For instance, if the enumeration of K is 23, 67, 45, 12, 36, 52, 38, . . ., the ma- 
chine will output: 

23 

23#67 

23#45#67 

12#23#45#67 

12#23#36#45#67 

12#23#36#45#52#67 

12#23#36#38#45#52#67 
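
The successive contents of the second tape in this example can be reproduced with a few lines of Python. This is a sketch of the bookkeeping only, not of the Turing machine; the generator name `tape_contents` is hypothetical and the numbers are written in decimal as in the example.

```python
import bisect


def tape_contents(enumeration):
    """Yield the second tape after each newly enumerated element, kept in increasing order."""
    seen = []
    for value in enumeration:
        bisect.insort(seen, value)          # insert while keeping the list sorted
        yield "#".join(str(v) for v in seen)
```

Calling `list(tape_contents([23, 67, 45, 12, 36, 52, 38]))` gives exactly the seven lines listed above.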



In the sequel, we use Berger's construction (see P).

Let $\tau_0$ be the tile set that, through Berger's construction, emulates this Turing
machine $M$. Let $p$ be a tiling produced by $\tau_0$. Suppose there is a recursive
function $f$ such that for all $x$, we have $f(x) > Q_p(x)$. Let's prove that $K$ is
recursive.

From Berger's construction, we observe that there exists a recursive function
p, such that n cells of the second tape are represented in any p(n) x p(n) pattern
of any tiling produced by $\tau_0$. In order to decide whether $x$ belongs to $K$, we have
to know if it is enumerated by $M$.

Suppose x is enumerated. It is written at most at a = p(I -I- * + 1) cells 

from the beginning of the second tape, since the machine outputs at most all the
numbers between 0 and $x$ to the left of $x$. As all the square patterns of length $a$
are found in all patterns of length $f(a)$, if $x$ is enumerated by $M$ then #x# can
be observed in all $f(a) \times f(a)$ patterns found in any tiling produced by $\tau_0$. As in
this construction all tilings contain the same (finite) patterns (we say that they
are mutually extractible, see 0 0), if #x# can be observed in an $f(a) \times f(a)$
pattern of a tiling produced by $\tau_0$, then $x$ is enumerated by $M$. But thanks to
Berger's construction, we can construct square patterns of arbitrary size that
can be extended to tilings of the plane produced by $\tau_0$. Hence we can decide
whether or not $x$ is enumerated by $M$, which contradicts the non-recursivity of
$K$. □

Abstract. A potential maximal clique of a graph is a vertex set that
induces a maximal clique in some minimal triangulation of that graph. 
It is known that if these objects can be listed in polynomial time for a 
class of graphs, the treewidth and the minimum fill-in are polynomially 
tractable for these graphs. We show here that the potential maximal 
cliques of a graph can be generated in polynomial time in the number of 
minimal separators of the graph. Thus, the treewidth and the minimum 
fill-in are polynomially tractable for all graphs with polynomial number 
of minimal separators. 



1 Introduction 

The notion of treewidth was introduced at the beginning of the eighties by
Robertson and Seymour ji^bl l'3bj in the framework of their graph minor
theory. A graph $H$ is a minor of a graph $G$ if we can obtain $H$ from $G$ by using the
following operations: discard a vertex, discard an edge, merge the endpoints of
an edge into a single vertex. Among the deep results obtained by Robertson and
Seymour, we can cite the fact that every class of graphs closed under taking
minors which does not contain all the planar graphs has bounded treewidth.

A graph is chordal or triangulated if every cycle of length greater than or equal
to four has a chord, i.e. an edge between two non-consecutive vertices of the
cycle. A triangulation of a graph is a chordal embedding, that is a supergraph,
on the same vertex set, which is triangulated. The treewidth problem consists in
finding a triangulation such that the size of the biggest clique is as small as
possible. Another closely related problem is the minimum fill-in problem. Here we
have to find a triangulation of the graph such that the number of added edges is
minimum. In both cases we can restrict ourselves to minimal triangulations, i.e.
triangulations with a set of edges minimal by inclusion.

The treewidth and the minimum fill-in play an important role in various areas 
of computer science e.g. sparse matrix factorization and algorithmic graph 
theory |31 El 121 IH| ■ For an extensive survey of these applications see also mu\ 

Unfortunately, computing the treewidth and the minimum fill-in of a graph is
NP-hard, even for co-bipartite graphs.

There exist several classes of graphs with unbounded treewidth for which we
can solve the treewidth and minimum fill-in problems in polynomial time.
Among them are the chordal bipartite graphs ^, circle and circular-arc
graphs EiEni, and AT-free graphs with polynomial number of minimal separators
m. Most of these algorithms use the fact that these classes of graphs have
a polynomial number of minimal separators. It was conjectured in [niiiHi that
the treewidth and the minimum fill-in should be tractable in polynomial time
for all the graphs having a polynomial number of minimal separators. We solve
here this ESA'93 conjecture.

The crucial interplay between the minimal separators of a graph and the
minimal triangulations was pointed out by Kloks, Kratsch and Müller in I2H;
these results were completed by Parra and Scheffler. Two minimal separators
$S$ and $T$ cross if $T$ intersects two connected components of $G \setminus S$, otherwise
they are parallel. The result of m states that a minimal triangulation is obtained
by considering a maximal set of pairwise parallel separators and by completing
them, i.e. by adding all the missing edges inside each separator. However, this
characterization gives no algorithmic information about how we should construct
a minimal triangulation in order to minimize the clique size or the fill-in.

Trying to solve this latter conjecture, we studied in EniiTTi the notion of
potential maximal clique. A vertex set $K$ is a potential maximal clique if it appears
as a maximal clique in some minimal triangulation. In m, we characterized a
potential maximal clique in terms of the maximal sets of neighbor separators,
which are the minimal separators contained in it. We designed an algorithm
which takes as input the graph and the maximal sets of neighbor separators and
which computes the treewidth in polynomial time in the size of the input. For
all the classes mentioned above we can list the maximal sets of neighbor
separators in polynomial time, so we unified all the previous algorithms. Actually,
the previous algorithms compute the maximal sets of neighbor separators in an
implicit manner. In HH, we gave a new characterization of the potential
maximal cliques avoiding the minimal separators. This allowed us to design a new
algorithm that, given a graph and its potential maximal cliques, computes the
treewidth and the minimum fill-in in polynomial time. Moreover, this approach
permitted us to solve the two problems for a new class of graphs, namely the
weakly triangulated graphs. It was probably the last natural class of graphs with
polynomial number of minimal separators for which the two problems remained
open.

This paper is devoted to solving the ESA'93 conjecture, that is, the treewidth
and the minimum fill-in are polynomially tractable for the whole class of graphs
having a polynomial number of minimal separators. Recall that if we are able to
generate all the potential maximal cliques of any graph in polynomial time in the
number of its minimal separators, then the treewidth and the minimum fill-in
are also computable in polynomial time in the number of minimal separators.
We define the notion of active separator for a potential maximal clique, which
leads to two results. First, the number of potential maximal cliques is
polynomially bounded by the number of minimal separators. Secondly, we are able to
enumerate the potential maximal cliques in polynomial time in their number.
These results reinforce our conviction that the potential maximal cliques are the
pertinent objects to study when dealing with treewidth and minimum fill-in.

2 Preliminaries 

Throughout this paper we consider finite, simple, undirected and connected 
graphs. 

Let $G = (V, E)$ be a graph. We will denote by $n$ and $m$ the number of vertices,
respectively the number of edges of $G$. For a vertex set $V' \subseteq V$ of $G$, we denote
by $N_G(V')$ the neighborhood of $V'$ in $G \setminus V'$, so $N_G(V') \subseteq V \setminus V'$.

A subset $S \subseteq V$ is an $a,b$-separator for two nonadjacent vertices $a, b \in V$
if the removal of $S$ from the graph separates $a$ and $b$ in different connected
components. $S$ is a minimal $a,b$-separator if no proper subset of $S$ separates $a$
and $b$. We say that $S$ is a minimal separator of $G$ if there are two vertices $a$
and $b$ such that $S$ is a minimal $a,b$-separator. Notice that a minimal separator
can be strictly included in another one. We denote by $\Delta_G$ the set of all minimal
separators of $G$.

Let $G$ be a graph and $S$ a minimal separator of $G$. We denote by $\mathcal{C}_G(S)$ the set
of connected components of $G \setminus S$. A component $C \in \mathcal{C}_G(S)$ is a full component
associated to $S$ if every vertex of $S$ is adjacent to some vertex of $C$, i.e.
$N_G(C) = S$. The following lemmas (see [15] for a proof) provide different
characterizations of a minimal separator:

Lemma 1. A set S of vertices of G is a minimal a, b-separator if and only if a 
and b are in different full components of S. 

Lemma 2. Let G be a graph and S be an a, b-separator of G. Then S is a 
minimal a, b-separator if and only if for any vertex x of S there is a path from 
a to b that intersects S only in x. 
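
Lemma 1 gives a direct test for minimal separators, which can be sketched as follows. The snippet assumes that the graph is stored as a dictionary mapping each vertex to the set of its neighbors; this representation and the function names are assumptions made for the illustration.

```python
def components_without(adj, removed):
    """Connected components of the graph once the vertices in `removed` are deleted."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps


def full_components(adj, S):
    """Components C of G minus S whose neighborhood in G is exactly S."""
    S = set(S)
    return [C for C in components_without(adj, S)
            if {v for u in C for v in adj[u] if v in S} == S]


def is_minimal_ab_separator(adj, S, a, b):
    """Lemma 1: a and b must lie in two different full components of S."""
    full = full_components(adj, S)
    comp_a = next((C for C in full if a in C), None)
    comp_b = next((C for C in full if b in C), None)
    return comp_a is not None and comp_b is not None and comp_a is not comp_b
```

On the path 1 - 2 - 3, the set {2} is a minimal 1,3-separator; adding a pendant vertex 4 attached to 1 shows that {2, 4} separates 1 from 3 but is not minimal.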

If $C \in \mathcal{C}(S)$, we say that $(S, C) = S \cup C$ is a block associated to $S$. A block
$(S, C)$ is called full if $C$ is a full component associated to $S$.

Let now $G = (V, E)$ be a graph and $G' = G[V']$ an induced subgraph of $G$.
We will compare the minimal separators of $G$ and $G'$.

Lemma 3. Let $G$ be a graph and $V' \subset V$ a vertex set of $G$. If $S$ is a minimal
$a,b$-separator of the induced subgraph $G' = G[V']$, then there is a minimal
$a,b$-separator $T$ of $G$ such that $T \cap V' = S$.

Proof. Let $S' = S \cup (V \setminus V')$. Clearly, $S'$ is an $a,b$-separator in $G$. Let $T$ be any
minimal $a,b$-separator contained in $S'$. We have to prove that $S \subseteq T$. Let $x$ be
any vertex of $S$ and suppose that $x \notin T$. Since $S$ is a minimal $a,b$-separator of
$G'$, we have a path $\mu$ joining $a$ and $b$ in $G'$ that intersects $S$ only in $x$ (see
lemma 2). But $\mu$ is also a path of $G$ that avoids $T$, contradicting the fact that
$T$ is an $a,b$-separator. It follows that $S \subseteq T$. Clearly, $T \cap V' \subseteq S$ by
construction of $T$, so $T \cap V' = S$. □






Vincent Bouchitte and loan Todinca 



The next corollary follows directly from lemma 3.

Corollary 1. Let $G = (V, E)$ be a graph and $a$ be a vertex of $G$. Consider the
graph $G' = G[V \setminus \{a\}]$. Then for any minimal separator $S$ of $G'$, we have that $S$
or $S \cup \{a\}$ is a minimal separator of $G$. In particular, $|\Delta_G| \geq |\Delta_{G'}|$.



3 Potential Maximal Cliques and Maximal Sets of 
Neighbor Separators 

The potential maximal cliques are the central object of this paper. We present in
this section some known results about the potential maximal cliques of a graph
(see also [laiTTlEni).

Definition 1. A vertex set $\Omega$ of a graph $G$ is called a potential maximal clique
if there is a minimal triangulation $H$ of $G$ such that $\Omega$ is a maximal clique of
$H$.

We denote by $\Pi_G$ the set of potential maximal cliques of the graph $G$.

A potential maximal clique $\Omega$ is strongly related to the minimal separators
contained in $\Omega$. In particular, any minimal separator of $G$ is contained in some
potential maximal clique of $G$. The number $|\Pi_G|$ of potential maximal cliques
of $G$ is at least $|\Delta_G|/n$.

If $K$ is a vertex set of $G$, we denote by $\Delta_G(K)$ the minimal separators of $G$
included in $K$.

Definition 2. A set $\mathcal{S}$ of minimal separators of a graph $G$ is called maximal set
of neighbor separators if there is a potential maximal clique $\Omega$ of $G$ such that
$\mathcal{S} = \Delta_G(\Omega)$. We also say that $\mathcal{S}$ borders $\Omega$ in $G$.

We proved in HH that the potential maximal cliques of a graph are sufficient
for computing the treewidth and the minimum fill-in of that graph.

Theorem 1. Given a graph $G$ and its potential maximal cliques $\Pi_G$, we can
compute the treewidth and the minimum fill-in of $G$ in $O(n^2 |\Delta_G| \times |\Pi_G|)$ time.

Let now $K$ be a set of vertices of a graph $G$. We denote by $C_1(K), \ldots, C_p(K)$
the connected components of $G \setminus K$. We denote by $S_i(K)$ the vertices of $K$
adjacent to at least one vertex of $C_i(K)$. When no confusion is possible we will
simply speak of $C_i$ and $S_i$. If $S_i(K) = K$ we say that $C_i(K)$ is a full component
associated to $K$. Finally, we denote by $\mathcal{S}_G(K)$ the set of all $S_i(K)$ in the graph
$G$, i.e. $\mathcal{S}_G(K)$ is formed by the neighborhoods, in the graph $G$, of the connected
components of $G \setminus K$.

Consider a graph $G = (V, E)$ and a vertex set $X \subseteq V$. We denote by $G_X$ the
graph obtained from $G$ by completing $X$, i.e. by adding an edge between every
pair of non-adjacent vertices of $X$. If $\mathcal{X} = \{X_1, \ldots, X_p\}$ is a set of subsets of
$V$, $G_{\mathcal{X}}$ is the graph obtained by completing all the elements of $\mathcal{X}$.

Theorem 2. Let $K \subseteq V$ be a set of vertices. $K$ is a potential maximal clique if
and only if:

1. $G \setminus K$ has no full components associated to $K$.

2. $G_{\mathcal{S}_G(K)}[K]$ is a clique.

Moreover, if $K$ is a potential maximal clique, then $\mathcal{S}_G(K)$ is the maximal set of
neighbor separators bordering $K$, i.e. $\mathcal{S}_G(K) = \Delta_G(K)$.



Remark 1. If $K$ is a potential maximal clique of $G$, for any pair of vertices $x$ and
$y$ of $K$ either $x$ and $y$ are adjacent in $G$ or they are connected by a path entirely
contained in some $C_i$ of $G \setminus K$ except for $x$ and $y$. The second case comes from
the fact that if $x$ and $y$ are not adjacent in $G$ they must belong to the same $S_i$
to ensure that $K$ becomes a clique after the completion of $\mathcal{S}_G(K)$. When we
refer to this property we will say that $x$ and $y$ are connected via the connected
component $C_i$.

Remark 2. Consider a minimal separator $S$ contained in a potential maximal
clique $\Omega$. Let us compare the connected components of $G \setminus S$ and the connected
components of $G \setminus \Omega$ (see m for the proofs). The set $\Omega \setminus S$ is contained in a full
component $C_\Omega$ associated to $S$. All the other connected components of $G \setminus S$ are
also connected components of $G \setminus \Omega$. Conversely, a connected component $C$ of
$G \setminus \Omega$ is either a connected component of $G \setminus S$ (in which case $N_G(C) \subseteq S$) or it
is contained in $C_\Omega$ (in which case $N_G(C) \not\subseteq S$).

Remark 3. Unlike the minimal separators, a potential maximal clique $\Omega'$ cannot
be strictly included in another potential maximal clique $\Omega$. Indeed, for any proper
subset $\Omega'$ of a potential maximal clique $\Omega$, the difference $\Omega \setminus \Omega'$ is in a full
component associated to $\Omega'$.

Theorem 2 leads to a polynomial algorithm that, given a vertex set $K$ of a graph
$G$, decides if $K$ is a potential maximal clique of $G$.
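
The two conditions of Theorem 2 translate into a short test. The following Python sketch assumes an adjacency-dictionary representation of the graph (each vertex mapped to the set of its neighbors) and uses hypothetical function names; it favors readability and does not attempt to reach the $O(nm)$ bound of the corollary below.

```python
from itertools import combinations


def components_without(adj, removed):
    """Connected components of the graph once the vertices in `removed` are deleted."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps


def is_potential_maximal_clique(adj, K):
    """Check the two conditions of Theorem 2 for the vertex set K."""
    K = set(K)
    # Neighborhoods S_i of the components C_i of G minus K.
    neighborhoods = [{v for u in C for v in adj[u] if v in K}
                     for C in components_without(adj, K)]
    # Condition 1: no component is full, i.e. no S_i equals K.
    if any(S == K for S in neighborhoods):
        return False
    # Condition 2: every pair of K is adjacent or lies in a common S_i,
    # so that completing all the S_i turns K into a clique.
    for x, y in combinations(K, 2):
        if y not in adj[x] and not any(x in S and y in S for S in neighborhoods):
            return False
    return True
```

On the cycle 1 - 2 - 3 - 4 - 1, the set {1, 2, 3} passes the test (it is a triangle of the triangulation adding the chord 1 - 3), whereas {1, 2} fails because the component {3, 4} is full.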

Corollary 2. Given a vertex set $K$ of a graph $G$, we can recognize in $O(nm)$
time if $K$ is a potential maximal clique of $G$.

Proof. We can compute in linear time the connected components $C_i$ of $G \setminus K$ and
their neighborhoods $S_i$. We can also verify in linear time that $G \setminus K$ has no full
components associated to $K$.

For each $x \in K$, we compute all the vertices $y \in K$ that are adjacent to $x$ in
$G$ or connected to $x$ via a $C_i$ in linear time (we have to search the neighborhood
of $x$ and the connected components $C_i$ with $x \in S_i$). So we can verify in $O(nm)$
time if $K$ satisfies the conditions of theorem 2. □

4 Potential Maximal Cliques and Active Separators

Theorem 2 tells us that if $\Omega$ is a potential maximal clique of a graph $G$, then
$\Omega$ is a clique in $G_{\Delta_G(\Omega)}$. We will divide the minimal separators of
$\Delta_G(\Omega)$ into two classes: those which create edges in $G_{\Delta_G(\Omega)}$, which are
called active, and the others, which are called inactive. More precisely:

Definition 3. Let $\Omega$ be a potential maximal clique of a graph $G$ and let
$S \subset \Omega$ be a minimal separator of $G$. We say that $S$ is an active separator for
$\Omega$ if $\Omega$ is not a clique in the graph $G_{\Delta_G(\Omega) \setminus \{S\}}$, obtained from $G$ by
completing all the minimal separators contained in $\Omega$, except $S$. Otherwise, $S$
is called inactive for $\Omega$.

Proposition 1. Let $\Omega$ be a potential maximal clique of $G$ and $S \subset \Omega$ a minimal
separator, active for $\Omega$. Let $(S, C_\Omega)$ be the block associated to $S$ containing $\Omega$
and let $x, y \in \Omega$ be two non-adjacent vertices of $G_{\Delta_G(\Omega) \setminus \{S\}}$. Then $\Omega \setminus S$ is
a minimal $x,y$-separator in $G[C_\Omega \cup \{x, y\}]$.

Proof. Remark that the vertices $x$ and $y$, non-adjacent in $G_{\Delta_G(\Omega) \setminus \{S\}}$, exist
by definition of an active separator. Moreover, since $\Omega$ is a clique in
$G_{\Delta_G(\Omega)}$, we must have $x, y \in S$.

Let us prove first that $\Omega \setminus S$ is an $x,y$-separator in the graph
$G' = G[C_\Omega \cup \{x, y\}]$. Suppose that both $x$ and $y$ are in a same connected
component $C_{xy}$ of $G' \setminus (\Omega \setminus S)$. Let $C = C_{xy} \setminus \{x, y\}$. Clearly, $C \subseteq C_\Omega$ is
a connected component of $G \setminus \Omega$. Let $T$ be the neighborhood of $C$ in $G$. By
theorem 2, $T$ is a minimal separator of $G$, contained in $\Omega$. By construction of $T$,
we have $x, y \in T$. Notice that $T \neq S$, otherwise $S$ would separate $C$ and $\Omega$,
contradicting the fact that $C \subseteq C_\Omega$ (see remark 2). It follows that $T$ is a minimal
separator of $\Delta_G(\Omega)$, different from $S$ and containing $x$ and $y$. This contradicts
the fact that $x$ and $y$ are not adjacent in $G_{\Delta_G(\Omega) \setminus \{S\}}$. We can conclude that
$\Omega \setminus S$ is an $x,y$-separator of $G'$.

We prove now that $\Omega \setminus S$ is a minimal $x,y$-separator of $G'$. We will show that,
for any vertex $z \in \Omega \setminus S$, there is a path $\mu$ joining $x$ and $y$ in $G'$ and such that
$\mu$ intersects $\Omega \setminus S$ only in $z$. By theorem 2, $x$ and $z$ are adjacent in
$G_{\Delta_G(\Omega)}$, so $x$ and $z$ are adjacent in $G$ or they are connected via a connected
component $C_i$ of $G \setminus \Omega$. Notice that $C_i \subseteq C_\Omega$: indeed, if $C_i \not\subseteq C_\Omega$, then
$C_i$ will be contained in some connected component $D$ of $G \setminus S$, different from
$C_\Omega$. According to remark 2, we would have $N_G(C_i) \subseteq N_G(D) \subseteq S$,
contradicting $z \in S_i$. In both cases we have a path $\mu'$ from $x$ to $z$ in $G'$ that
intersects $\Omega \setminus S$ only in $z$.

For the same reasons, $z$ and $y$ are adjacent in $G$, or there is a connected
component $C_j$ of $G \setminus \Omega$ such that $C_j \subseteq C_\Omega$ and $z, y \in S_j = N_G(C_j)$. This gives
us a path $\mu''$ from $z$ to $y$ in $G'$, such that $\mu'' \cap (\Omega \setminus S) = \{z\}$. Remark that
$C_i \neq C_j$, otherwise we would have a path from $x$ to $y$ in $C_i \cup \{x, y\}$,
contradicting the fact that $\Omega \setminus S$ separates $x$ and $y$ in $G'$. So the paths $\mu'$ and
$\mu''$ are disjoint except for $z$, and their concatenation is a path $\mu$, joining $x$
and $y$ in $G'$ and intersecting $\Omega \setminus S$ only in $z$. We conclude by lemma 2 that
$\Omega \setminus S$ is a minimal separator of $G'$. □

By proposition 1, the set $T' = \Omega \setminus S$ is a minimal separator of the subgraph
of $G$ induced by $C_\Omega \cup \{x, y\}$. By lemma 3, there is a separator $T$ of $G$ such that
$T' \subseteq T$ and $T \cap C_\Omega = T'$. We deduce:

Theorem 3. Let $\Omega$ be a potential maximal clique and $S$ be a minimal separator,
active for $\Omega$. Let $(S, C_\Omega)$ be the block associated to $S$ containing $\Omega$. There is a
minimal separator $T$ of $G$ such that $\Omega = S \cup (T \cap C_\Omega)$.

It follows easily that the number of potential maximal cliques containing at
least one active separator is polynomially bounded in the number of minimal
separators of $G$. More exactly, the number of these potential maximal cliques is
bounded by the number of blocks $(S, C_\Omega)$ multiplied by the number of minimal
separators $T$, so by $n |\Delta_G|^2$. Clearly, these potential maximal cliques have a
simple structure and can be computed directly from the minimal separators of
the graph.
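
The enumeration suggested by Theorem 3 can be sketched as follows: try every block $(S, C)$ and every minimal separator $T$, form $S \cup (T \cap C)$, and keep the sets that pass the test of Theorem 2. The sketch assumes that the list of minimal separators is supplied by the caller and that the graph is an adjacency dictionary; the function names are hypothetical. Note that it returns every potential maximal clique of this shape, which includes, but need not be limited to, those having an active separator.

```python
from itertools import combinations


def components_without(adj, removed):
    """Connected components of the graph once the vertices in `removed` are deleted."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps


def is_potential_maximal_clique(adj, K):
    """Theorem 2: no full component, and K becomes a clique once the S_i are completed."""
    K = set(K)
    neighborhoods = [{v for u in C for v in adj[u] if v in K}
                     for C in components_without(adj, K)]
    if any(S == K for S in neighborhoods):
        return False
    return all(y in adj[x] or any(x in S and y in S for S in neighborhoods)
               for x, y in combinations(K, 2))


def candidates_from_separators(adj, separators):
    """Potential maximal cliques of the form S | (T & C), C a component of G minus S."""
    found = set()
    for S in separators:
        for C in components_without(adj, S):
            for T in separators:
                omega = frozenset(S) | (frozenset(T) & C)
                if is_potential_maximal_clique(adj, omega):
                    found.add(omega)
    return found
```

The triple loop examines at most $n |\Delta_G|^2$ candidate sets, in line with the bound above.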

Let us make a first observation about the potential maximal cliques contain- 
ing inactive minimal separators. 

Proposition 2. Let $\Omega$ be a potential maximal clique and $S \subset \Omega$ a minimal
separator which is inactive for $\Omega$. Let $D_1, \ldots, D_p$ be the full components associated
to $S$ that do not intersect $\Omega$. Then $\Omega$ is a potential maximal clique of the graph
$G \setminus \bigcup_{i=1}^{p} D_i$.

Proof. Let $G' = G \setminus \bigcup_{i=1}^{p} D_i$. The connected components of $G' \setminus \Omega$ are exactly
the connected components of $G \setminus \Omega$, except for $D_1, \ldots, D_p$, and their
neighborhoods in $G'$ are the same as in $G$. It follows that the set $\mathcal{S}_{G'}(\Omega)$ of the
neighborhoods of the connected components of $G' \setminus \Omega$ is exactly
$\Delta_G(\Omega) \setminus \{S\}$. Clearly, $G' \setminus \Omega$ has no full components associated to $\Omega$. Since
$S$ is not active for $\Omega$, we deduce that $\Omega$ is a clique in $G'_{\mathcal{S}_{G'}(\Omega)}$. So, by
theorem 2, $\Omega$ is a potential maximal clique of $G'$. □

5 Removing a Vertex 

Let $G = (V, E)$ be a graph and $a$ be a vertex of $G$. We denote by $G'$ the graph
obtained from $G$ by removing $a$, i.e. $G' = G[V \setminus \{a\}]$. We will show here how to
obtain the potential maximal cliques of $G$ using the minimal separators of $G$
and $G'$ and the potential maximal cliques of $G'$. By corollary 0 we know that
$G$ has at least as many minimal separators as $G'$: for any minimal separator $S$
of $G'$, either $S$ is a minimal separator of $G$, or $S \cup \{a\}$ is a minimal separator of
$G$. It will follow that the potential maximal cliques of a graph can be computed
in polynomial time in the size of the graph and the number of its minimal
separators.

Proposition 3. Let $\Omega$ be a potential maximal clique of $G$ such that $a \in \Omega$. Then
$\Omega' = \Omega \setminus \{a\}$ is either a potential maximal clique of $G'$ or a minimal separator
of $G$.






Vincent Bouchitté and Ioan Todinca



Proof. Let $C_1, \ldots, C_p$ be the connected components of $G \setminus \Omega$ and $S_i$ be the
neighborhood of $C_i$ in $G$. We denote as usual by $\mathcal{S}_G(\Omega)$ the set of all the $S_i$'s. Remark
that the connected components of $G' \setminus (\Omega \setminus \{a\})$ are exactly $C_1, \ldots, C_p$ and their
neighborhoods in $G'$ are respectively $S_1 \setminus \{a\}, \ldots, S_p \setminus \{a\}$. Since $\Omega$ is a clique in
$G_{\mathcal{S}_G(\Omega)}$ (by theorem 2), it follows that $\Omega' = \Omega \setminus \{a\}$ is a clique in $G'_{\mathcal{S}_{G'}(\Omega')}$.

If $G' \setminus \Omega'$ has no full components associated to $\Omega'$, then $\Omega'$ is a potential maximal
clique of $G'$, according to theorem 0. Suppose now that $C_1$ is a full component
associated to $\Omega'$ in $G'$. Since $C_1$ is not a full component associated to $\Omega$ in $G$,
it follows that $N_G(C_1) = \Omega'$. Thus, $\Omega'$ is a minimal separator of $G$, by theorem
□. □



Lemma 4. Let $G$ be a graph and $\tilde{G}$ be any induced subgraph of $G$. Consider a
potential maximal clique $\Omega$ of $\tilde{G}$. Suppose that for any connected component $C$
of $G \setminus \tilde{G}$, its neighborhood $N_G(C)$ is strictly contained in $\Omega$. Then $\Omega$ is also a
potential maximal clique of $G$.

Proof. Let $C$ be any connected component of $G \setminus \tilde{G}$. We denote by $\tilde{V}$ the set
of vertices of $\tilde{G}$. We want to prove that $\Omega$ is a potential maximal clique of
the graph $G' = G[\tilde{V} \cup C]$. Indeed, the connected components of $G' \setminus \Omega$ are the
connected components of $\tilde{G} \setminus \Omega$ plus $C$. The set $\mathcal{S}_{G'}(\Omega)$ of their neighborhoods
consists of $\{N_G(C)\} \cup \mathcal{S}_{\tilde{G}}(\Omega)$. Since $N_G(C)$ is strictly contained in $\Omega$, $G' \setminus \Omega$ has
no full components associated to $\Omega$. Obviously $\Omega$ is a clique in $G'_{\mathcal{S}_{G'}(\Omega)}$, so $\Omega$ is
a potential maximal clique of $G'$.

The result follows by an easy induction on the number of connected components
of $G \setminus \tilde{G}$. □



Proposition 4. Let $\Omega$ be a potential maximal clique of $G$ such that $a \notin \Omega$. Let
$C_a$ be the connected component of $G \setminus \Omega$ containing $a$ and let $S$ be the minimal
separator of $\Omega$ such that $S = N(C_a)$.

If $\Omega$ is not a potential maximal clique of $G' = G[V \setminus \{a\}]$, then $S$ is active for
$\Omega$. Moreover, $S$ is not a minimal separator of $G'$.

Proof. Suppose that $S$ is not active for $\Omega$. Let $D_1, \ldots, D_p$ be the full components
associated to $S$ in $G$ that do not intersect $\Omega$. One of them, say $D_1$, is $C_a$. Let
$G''$ be the graph obtained from $G$ by removing the vertices of $D_1 \cup \ldots \cup D_p$.
According to proposition 2, $\Omega$ is a potential maximal clique of $G''$. Notice that
$G''$ is also an induced subgraph of $G'$. Any connected component $C$ of $G' \setminus G''$ is
contained in some $D_i$, and its neighborhood in $G'$ is included in $S = N_G(D_i)$.
Thus, $N_{G'}(C)$ is strictly contained in $\Omega$. It follows from lemma 4 that $\Omega$ is a
potential maximal clique of $G'$, contradicting our hypothesis. We deduce that,
in the graph $G$, $S$ is an active separator for $\Omega$.

It remains to show that $S$ is not a minimal separator of $G'$. We prove that if $S$
is a minimal separator of $G'$, then $\Omega$ would be a potential maximal clique of $G'$.
Let $C_1, \ldots, C_p, C_a$ be the connected components of $G \setminus \Omega$ and let $S_1, \ldots, S_p, S$
be their neighborhoods in $G$. Then the connected components of $G' \setminus \Omega$ are
$C_1, \ldots, C_p, C'_1, \ldots, C'_q$, with $C'_j \subset C_a$. Their neighborhoods in $G'$ are respectively
$S_1, \ldots, S_p, S'_1, \ldots, S'_q$, with $S'_j \subseteq S$. In particular, $G' \setminus \Omega$ has no full
component associated to $\Omega$ and $\mathcal{S}_{G'}(\Omega)$ contains every element of $\mathcal{S}_G(\Omega)$, except
possibly $S$. Suppose that $S$ is a minimal separator of $G'$ and let $D$ be a full
component associated to $S$ in $G'$, different from $C_\Omega$. By remark 2, $D$ is also a
connected component of $G' \setminus \Omega$, so $S = N_{G'}(D)$ is an element of $\mathcal{S}_{G'}(\Omega)$.
Therefore, $\mathcal{S}_G(\Omega) \subseteq \mathcal{S}_{G'}(\Omega)$, so $\Omega$ is a clique in the graph $G'_{\mathcal{S}_{G'}(\Omega)}$. We can conclude
by theorem 2 that $\Omega$ is a potential maximal clique of $G'$, contradicting our choice
of $\Omega$. It follows that $S$ is not a minimal separator of $G'$. □

The following theorem, that comes directly from propositions 0 and 0 and 
theorem 0 shows us how to obtain the potential maximal cliques of G from the 
potential maximal cliques of G' and the minimal separators of G. 

Theorem 4. Let $\Omega$ be a potential maximal clique of $G$ and let $G' = G \setminus \{a\}$.
Then one of the following cases holds:

1. $\Omega = \Omega' \cup \{a\}$, where $\Omega'$ is a potential maximal clique of $G'$.

2. $\Omega = \Omega'$, where $\Omega'$ is a potential maximal clique of $G'$.

3. $\Omega = S \cup \{a\}$, where $S$ is a minimal separator of $G$.

4. $\Omega = S \cup (C \cap T)$, where $S$ is a minimal separator of $G$, $C$ is a connected
component of $G \setminus S$ and $T$ is a minimal separator of $G$. Moreover, $S$ does not
contain $a$ and $S$ is not a minimal separator of $G'$.



Corollary 3. Let $G$ be a graph, $a$ be a vertex of $G$ and $G' = G \setminus \{a\}$. The number
$|\Pi_G|$ of potential maximal cliques of $G$ is polynomially bounded in the number
$|\Pi_{G'}|$ of potential maximal cliques of $G'$, the number $|\Delta_G|$ of minimal separators
of $G$ and the size $n$ of $G$.

More precisely, $|\Pi_G| \le |\Pi_{G'}| + n(|\Delta_G| - |\Delta_{G'}|)|\Delta_G| + |\Delta_G|$.

Proof. We will count the potential maximal cliques of the graph $G$ corresponding
to each case of theorem 4.

Notice that for a potential maximal clique $\Omega'$ of $G'$, only one of $\Omega'$ and
$\Omega' \cup \{a\}$ can be a potential maximal clique of $G$: indeed, a potential maximal
clique of a graph cannot be strictly included in another one (see remark 0). So
the number of potential maximal cliques of type 1 and 2 of $G$ is bounded by
$|\Pi_{G'}|$.

The number of potential maximal cliques of type 3 is clearly bounded by
$|\Delta_G|$.

Let us count now the number of potential maximal cliques of type 4, which
can be written as $S \cup (T \cap C)$. By lemma 0, for any minimal separator $S'$ of $G'$,
we have that $S'$ or $S' \cup \{a\}$ is a minimal separator of $G$. Clearly, the number of
minimal separators of $G$ of type $S'$ or $S' \cup \{a\}$ with $S' \in \Delta_{G'}$ is at least $|\Delta_{G'}|$.
Our minimal separator $S$ does not contain $a$ and is not a minimal separator of
$G'$, so $S$ is not of type $S'$ or $S' \cup \{a\}$, with $S' \in \Delta_{G'}$. It follows that the number
of minimal separators $S$ that we can choose is at most $|\Delta_G| - |\Delta_{G'}|$. For each
minimal separator $S$, we have at most $n$ connected components $C$ of $G \setminus S$ and
at most $|\Delta_G|$ separators $T$, so the number of potential maximal cliques of type
4 is at most $n(|\Delta_G| - |\Delta_{G'}|)|\Delta_G|$. □

Let now $a_1, a_2, \ldots, a_n$ be an arbitrary ordering of the vertices of $G$. We denote
by $G_i$ the graph $G[\{a_1, \ldots, a_i\}]$, so $G_n = G$ and $G_1$ has a single vertex.
By corollary 3 we have that for any $i$, $1 \le i < n$,
$|\Pi_{G_{i+1}}| \le |\Pi_{G_i}| + n(|\Delta_{G_{i+1}}| - |\Delta_{G_i}|)|\Delta_{G_{i+1}}| + |\Delta_{G_{i+1}}|$.
Notice that $|\Delta_{G_i}| \le |\Delta_{G_{i+1}}|$, in particular
each graph $G_i$ has at most $|\Delta_G|$ minimal separators. Clearly, the graph $G_1$ has
a unique potential maximal clique. It follows directly that the graph $G$ has at
most $n|\Delta_G|^2 + n|\Delta_G| + 1$ potential maximal cliques.
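
The summation behind this bound can be written out explicitly. The following is a sketch of the telescoping argument, using the shorthand $\Pi_i$ and $\Delta_i$ for $\Pi_{G_i}$ and $\Delta_{G_i}$ (a notation introduced here for brevity, not taken from the text):

```latex
\begin{align*}
|\Pi_G| = |\Pi_n|
  &\le |\Pi_1| + \sum_{i=1}^{n-1}
       \Bigl( n\,(|\Delta_{i+1}| - |\Delta_i|)\,|\Delta_{i+1}| + |\Delta_{i+1}| \Bigr) \\
  &\le 1 + n\,|\Delta_G| \sum_{i=1}^{n-1} \bigl(|\Delta_{i+1}| - |\Delta_i|\bigr)
         + (n-1)\,|\Delta_G| \\
  &=   1 + n\,|\Delta_G|\,\bigl(|\Delta_n| - |\Delta_1|\bigr) + (n-1)\,|\Delta_G| \\
  &\le n\,|\Delta_G|^2 + n\,|\Delta_G| + 1 .
\end{align*}
```

The second line uses $|\Delta_{i+1}| \le |\Delta_G|$ for every $i$, and the third line is the telescoping sum.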

Proposition 5. The number of the potential maximal cliques of a graph is
polynomially bounded in the number of its minimal separators and in the size of the
graph.

More precisely, a graph $G$ has at most $n|\Delta_G|^2 + n|\Delta_G| + 1$ potential maximal
cliques.

We now give an algorithm computing the potential maximal cliques of a
graph. We suppose that we have a function IS_PMC($\Omega$, $G$), that returns TRUE
if $\Omega$ is a potential maximal clique of $G$, FALSE otherwise.
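
One possible way to realize such a test is sketched below. It relies on the characterization used in the proofs above: $\Omega$ is a potential maximal clique when $G \setminus \Omega$ has no full component associated to $\Omega$ and every pair of non-adjacent vertices of $\Omega$ lies in the neighborhood of a common component. The function names (`is_pmc`, `components`, `neighborhood`) and the adjacency-dictionary representation are assumptions made for this illustration, not the authors' implementation.

```python
from itertools import combinations


def components(adj, removed):
    """Connected components of the graph after deleting the vertex set `removed`."""
    seen, comps = set(removed), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def neighborhood(adj, comp):
    """Vertices outside `comp` that are adjacent to some vertex of `comp`."""
    return {w for u in comp for w in adj[u]} - comp


def is_pmc(omega, adj):
    """Sketch of IS_PMC(omega, G); adj maps each vertex to its set of neighbors."""
    omega = set(omega)
    if not omega:
        return False
    seps = [neighborhood(adj, c) for c in components(adj, omega)]
    # No component of G \ omega may be full, i.e. see all of omega.
    if any(s == omega for s in seps):
        return False
    # omega must become a clique once every neighborhood in seps is completed.
    for x, y in combinations(omega, 2):
        if y not in adj[x] and not any(x in s and y in s for s in seps):
            return False
    return True


# Usage: the cycle on four vertices and the path on three vertices.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
```

On the four-cycle the sets of three vertices pass the test, while an edge such as {0, 1} fails because the remaining two vertices form a full component.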



function ONE_MORE_VERTEX

Input: the graphs $G$, $G'$ and a vertex $a$ such that $G' = G \setminus \{a\}$;
       the potential maximal cliques $\Pi_{G'}$ of $G'$, the minimal separators $\Delta_{G'}$, $\Delta_G$
       of $G'$ and $G$.

Output: the potential maximal cliques $\Pi_G$ of $G$.
begin
    $\Pi_G \leftarrow \emptyset$
    for each p.m.c. $\Omega' \in \Pi_{G'}$
        if IS_PMC($\Omega'$, $G$) then
            $\Pi_G \leftarrow \Pi_G \cup \{\Omega'\}$
        else
            if IS_PMC($\Omega' \cup \{a\}$, $G$) then
                $\Pi_G \leftarrow \Pi_G \cup \{\Omega' \cup \{a\}\}$
    for each minimal separator $S \in \Delta_G$
        if IS_PMC($S \cup \{a\}$, $G$) then
            $\Pi_G \leftarrow \Pi_G \cup \{S \cup \{a\}\}$
        if ($a \notin S$ and $S \notin \Delta_{G'}$) then
            for each $T \in \Delta_G$
                for each full component $C$ associated to $S$ in $G$
                    if IS_PMC($S \cup (T \cap C)$, $G$) then
                        $\Pi_G \leftarrow \Pi_G \cup \{S \cup (T \cap C)\}$
    return $\Pi_G$
end

Table 1. Computing the p.m.c.'s of $G$ from the p.m.c.'s of $G' = G \setminus \{a\}$

The function ONE_MORE_VERTEX of table 1 computes the potential
maximal cliques of a graph $G$ from the potential maximal cliques of a graph
$G' = G \setminus \{a\}$. This function is based on theorem 4. The main program, presented
in table 2, successively computes the potential maximal cliques of the graphs
$G_i = G[\{a_1, \ldots, a_i\}]$. The algorithm is clearly polynomial in the size of $G$ and
$|\Delta_G|$. The complexity proof is omitted due to space restrictions.



main program

Input: a graph $G$

Output: the potential maximal cliques $\Pi_G$ of $G$
begin
    let $\{a_1, \ldots, a_n\}$ be the vertices of $G$
    $\Pi_{G_1} \leftarrow \{\{a_1\}\}$
    for $i = 1, n - 1$
        compute $\Delta_{G_{i+1}}$
        $\Pi_{G_{i+1}}$ = ONE_MORE_VERTEX($G_i$, $G_{i+1}$, $\Pi_{G_i}$, $\Delta_{G_i}$, $\Delta_{G_{i+1}}$)
    $\Pi_G = \Pi_{G_n}$
end

Table 2. Algorithm computing the potential maximal cliques



Theorem 5. The potential maximal cliques of a graph can be listed in polynomial
time in its size and the number of its minimal separators.

More exactly, the potential maximal cliques of a graph are computable in
$O(n^2 m |\Delta_G|^2)$ time.

We deduce directly from theorem [Q proposition El and theorem El 

Theorem 6. The treewidth and the minimum fill-in of a graph can be computed
in polynomial time in the size of the graph and the number of its minimal
separators. The complexity of the algorithm is $O(n^3 |\Delta_G|^3 + n^2 m |\Delta_G|^2)$.

Abstract. Distance labeling schemes are schemes that label the vertices
of a graph with short labels in such a way that the distance between any
two vertices can be inferred from inspecting their labels. It is shown in
this paper that the classes of interval graphs and permutation graphs
enjoy such a distance labeling scheme using $O(\log^2 n)$ bit labels on
$n$-vertex graphs. Towards establishing these results, we present a general
property for graphs, called well-$(\alpha, g)$-separation, and show that graph
classes satisfying this property have $O(g(n) \cdot \log n)$ bit labeling schemes.
In particular, interval graphs are well-$(2, \log n)$-separated and permutation
graphs are well-$(6, \log n)$-separated.



1 Introduction 

Traditional graph representations are based on storing the graph topology in 
a data structure, e.g., an adjacency matrix, enabling one to infer information 
about the graph by inspecting the data structure. In such a context, the vertices 
of the graph are usually represented by distinct indices, serving as pointers to the 
data structure, but otherwise devoid of any meaning or structural significance. 

In contrast, one may consider using more “informative” labeling schemes for
graphs. The idea is to assign each vertex $v$ a label $L(v)$, selecting the labels in a
way that will allow us to infer information about the vertices (e.g., adjacency or
distance) directly from their labels, without requiring any additional memory.

In particular, a graph family $\mathcal{F}$ is said to have an $l(n)$ adjacency-labeling
scheme if there is a function $L$ labeling the vertices of each $n$-vertex graph in
$\mathcal{F}$ with distinct labels of up to $l(n)$ bits, and there exists an algorithm that,
given the labels $L(v), L(w)$ of two vertices in a graph from $\mathcal{F}$, decides the
adjacency of $v$ and $w$ in time polynomial in the length of the given labels. (Note
that this algorithm is not given any additional information, other than the two
labels, regarding the graph from which the vertices were taken.)

Adjacency-labeling schemes were introduced in [KNR88]. Specifically, it is
shown in [KNR88] that a number of graph families enjoy $O(\log n)$ adjacency
labeling schemes, including trees, bounded arboricity graphs (including, in
particular, graphs of bounded degree and graphs of bounded genus, e.g., planar graphs),
various intersection-based graphs such as interval graphs, and c-decomposable
graphs. It is also easy to encode the ancestry (or descendence) relation in a tree
using interval-based schemes (cf. Em).

More recently, distance labeling schemes were introduced in These 

schemes are similar to adjacency labeling schemes, except that the labels of any
two vertices $u, v$ in a graph $G \in \mathcal{F}$ should enable us to compute the distance
between $u$ and $v$ in $G$. Such schemes can be useful for various applications in the
context of communication network protocols, as discussed in Iffl. A distance
labeling scheme for trees using $O(\log^2 n)$ bit labels has been given in [Pel99].
This result is complemented by a lower bound proven in nmEEng, showing
that $\Omega(\log^2 n)$ bit labels are necessary for the class of all trees.

The distance labeling scheme given in [Pel99] for trees is based on the notion
of separators. Moreover, that scheme has recently been expanded to other graph
classes with small separators [GPPR99]. In particular, we say that a class of
graphs $\mathcal{G}$ has a recursive $f(n)$-separator if every $n$-node graph $G \in \mathcal{G}$ has a
subset of nodes $S$ such that (1) $|S| \le f(n)$, and (2) every connected component $G'$
of the graph $G \setminus S$, obtained from $G$ by removing all the nodes of $S$, belongs to
$\mathcal{G}$, and has at most $2n/3$ nodes. Then it is shown in [GPPR99] that every graph
class $\mathcal{G}$ with an $f(n)$-separator has an $O(f(n) \log n + \log^2 n)$ distance labeling
scheme. This implies, for instance, the existence of an $O(\sqrt{n} \log n)$ distance
labeling scheme for planar graphs, and an $O(\log^2 n)$ distance labeling scheme for
graphs of bounded treewidth.

The current paper expands the study of the problem, by exploring the pos- 
sibility of designing efficient distance labeling schemes for graph classes that do 
not enjoy small separators. Specifically, we consider graph classes for which there 
exist recursive separators which are not necessarily small, but are nevertheless 
“well-behaved” in a certain sense. Intuitively, in graphs enjoying well-behaved 
separators, distances between vertices can be inferred from relatively little in- 
formation concerning the distances from every vertex to a few “representative” 
vertices in the (potentially large) separator. This property can be guaranteed, 
for instance, in graph classes enjoying small diameter separators. 

Towards making this intuition more precise, we introduce the notion of
well-$(\alpha, g)$-separated graph classes, which are graph classes enjoying separators with
some special properties. For any well-$(\alpha, g)$-separated graph class, we construct
a distance labeling scheme with $O(g(n) \cdot \log n)$ bit labels.

We then demonstrate the applicability of our construction technique by
establishing the fact that the families of interval graphs and permutation graphs
are all well-$(k, \log n)$-separated for some constant $k$, so they all have $O(\log^2 n)$
bit distance labeling schemes. (Note that in both families, separators might be
of size $\Omega(n)$.) Due to space considerations, we present here only the scheme for
interval graphs. (The scheme for permutation graphs can be found in [KKP99].)

2 The Well-Separation Property

A well-$(\alpha, g)$-separated graph family consists of graphs that can be divided by a
special separator into subgraphs of size at most $n/2$, so that $O(\log n)$ labels suffice for
calculating the distances in the separator, and from every vertex in the subgraphs
to the separator.

Let us first define the following terminology. Consider a graph $G = (V, E)$.
For every $v, w \in V$, let $dist(v, w, G)$ denote the distance between $v$ and $w$
in $G$, namely, the length of the shortest path between them. We abbreviate
$dist(v, w, G)$ as $dist(v, w)$ whenever $G$ is understood from the context.

For a subset $U \subseteq V$, let $dist(v, U)$ denote the minimum distance between $v$
and any $w \in U$, and let $dist_U(v, w, G)$ be the length of the shortest path between
$v$ and $w$ in $G$ that passes through at least one vertex of $U$.

Assume each vertex $v \in V$ has a unique identifier $I(v)$. For every $v, w \in V$,
define $M(v, w) = (I(w), dist(v, w))$. For every vertex $v \in V$ and subset
$D = \{d_1, \ldots, d_t\} \subseteq V$, $M(v, D)$ is defined to be the $t$-tuple $(M(v, d_1), \ldots, M(v, d_t))$.

Given an n-vertex graph G, a separator is a non-empty set of vertices whose 
removal breaks G into (zero or more) subgraphs with no interconnecting edges 
between them, each with at most n/2 vertices. 

Let us now define a separation property which exists in some natural graph 
families, and whose existence is later used as the basis for the design of a distance 
labeling scheme. 

Well-separation: A graph family $\mathcal{G}$ is well-$(\alpha, g)$-separated for an integer
$\alpha > 0$ and a function $g : \mathbb{N} \to \mathbb{N}$, if there exists an identifier function $I$ assigning
unique identifiers to the vertices of every graph in $\mathcal{G}$, and for every $n$-vertex
graph $G$ in $\mathcal{G}$, there exists a set of vertices $C$, called the $\alpha$-separator of $G$, with
the following properties:

1. Deleting $C$ from $G$ disconnects it into (zero or more) subgraphs $G_1, \ldots, G_m$,
   with no interconnecting edges between them, such that for every $1 \le i \le m$:

   (a) $|V(G_i)| \le n/2$.

   (b) $G_i \in \mathcal{G}$ (hence in particular $G_i$ is well-$(\alpha, g)$-separated).

2. For every $v \in V(G)$, the identifier $I(v)$ is of size $g(n)$.

3. There exist polynomial time computable functions $f^{ss}$, $f^{sg}$, $f^{gg}$ and $f^{sgg}$,
   and for every $v \in G$ there exists a reference set of vertices
   $D^{v,C} = \{d_1^v, \ldots, d_\alpha^v\} \subseteq V(G)$, such that given

   $\xi(v, w, G) = (I(v), I(w), M(v, D^{v,C}), M(w, D^{w,C}))$,

   (a) the function $f^{ss}$ computes the distance between two vertices in the
       separator.
       Formally, for $v, w \in C$, $f^{ss}(\xi(v, w, G)) = dist(v, w, G)$.

   (b) the function $f^{sg}$ computes the distance between two vertices one of which
       is in the separator and the other is not.
       Formally, for $v \in V(G_i)$, $w \in C$, $f^{sg}(\xi(v, w, G)) = dist(v, w, G)$.

   (c) the function $f^{gg}$ computes the distance between every two vertices that
       are not in the same subgraph, and are not in the separator.
       Formally, for $v \in V(G_i)$ and $w \in V(G_j)$, $i \ne j$, $f^{gg}(\xi(v, w, G)) = dist(v, w, G)$.

   (d) the function $f^{sgg}$ computes for every two vertices in the same subgraph,
       the length of the shortest path between them that passes through at
       least one of the vertices of the separator.
       Formally, for $v, w \in V(G_i)$, $f^{sgg}(\xi(v, w, G)) = dist_C(v, w, G)$.

3 A Labeling Scheme for Well-Separated Graphs 

This section describes a distance labeling scheme for $n$-vertex graphs $G$ taken
from a well-$(\alpha, g)$-separated graph family. The construction makes use of the
$\alpha$-separator of the graph $G$. We assume that $G$ is a connected graph. Otherwise,
if $G$ is not connected, we treat each component separately, and add an index to
the resulting label of each vertex to indicate its connected component.

3.1 The Labeling System

The vertices of a given well-$(\alpha, g)$-separated graph $G = (V, E)$ are labeled as
follows. As a preprocessing step, calculate for every vertex $v \in V$ the identifier
$I(v)$ whose existence is asserted by the well-separation property.

The actual labeling is constructed by a recursive procedure Assign_Label,
that applied to $G$, returns the label $L(v)$ of every vertex $v \in V$. The procedure,
based on recursively partitioning the graph by finding $\alpha$-separators, is presented
in Fig. 1. The procedure generates for every $v \in G$ a label of the form

$L(v) = J_1(v) \circ \ldots \circ J_q(v) \circ I(v)$.



1. Let $C$ be an $\alpha$-separator of $G$.

2. For every vertex $v \in C$ set $L(v) \leftarrow (M(v, \emptyset), 0) \circ I(v)$

3. If $C \ne V(G)$ then do:

   (a) Delete $C$ from the graph and partition $G$ into $m$ mutually disconnected
       subgraphs, $G_1 \ldots G_m$

   (b) For every $1 \le t \le m$ do:

       i. Recursively invoke Assign_Label($G_t$) to get $L'(v)$ for every $v \in G_t$

       ii. For every vertex $v \in G_t$ do:

           I. Find $D^{v,C} \subseteq V(G)$

           II. $J(v) = (M(v, D^{v,C}), t)$

           III. $L(v) = J(v) \circ L'(v)$

4. Return $L(v)$ for every $v \in V(G)$.

Fig. 1. Algorithm Assign_Label($G$).






Michal Katz, Nir A. Katz, and David Peleg 



3.2 Computing the Distances 

Let us next describe a recursive procedure Dist_Comput for computing the
distance between two vertices $v, w$ in $G$. Consider two vertices $v, w$ in $G$, with
labels

$L(v) = J_1(v) \circ \ldots \circ J_p(v) \circ I(v)$ and $L(w) = J_1(w) \circ \ldots \circ J_q(w) \circ I(w)$,

respectively, where

$J_i(v) = (M(v, D^{v,C^i}), t_i^v)$ for $1 \le i \le p$,

and

$J_i(w) = (M(w, D^{w,C^i}), t_i^w)$ for $1 \le i \le q$.

Note that during the first few partitions, the vertices $v$ and $w$ may belong
to the same subgraph. Let $G^1 = G$. Suppose both $v$ and $w$ belong to the same
subgraph of $G^i$ on level $i$, for every $1 \le i \le k - 1$, but the graph $G^k$ on level $k$,
while still containing both of them, is separated by $C^k$ into subgraphs in such a
way that $v$ and $w$ are not in the same subgraph.

Procedure Dist_Comput receives the labels $L(v)$ and $L(w)$ of the vertices $v$
and $w$. It starts by calculating the first level $k$ on which $v$ and $w$ both belong to
$G^k$ but are no longer in the same subgraph of $G^k$. (This is indicated by the fact
that in the $k$th fields of the two labels, $t_k^v \ne t_k^w$ or $t_k^v = 0$ or $t_k^w = 0$.) Let $C^i$ be
the separator that separates $G^i$, for every $1 \le i \le k$. The procedure calculates
the length of the shortest path $p_i(v, w)$ between $v$ and $w$ that goes through the
level-$i$ separator $C^i$, for every $1 \le i \le k - 1$. It then calculates also the length of
the shortest path connecting them in the subgraph $G^k$. Finally, it returns the
minimum of these lengths. The procedure is described formally in Figure 2.

3.3 Analysis 

We now analyze the correctness and cost of the resulting distance labeling
scheme. (Some proofs are omitted; see [KKP99].)

Lemma 1. In a well-$(\alpha, g)$-separated graph $G$, if $C$ is an $\alpha$-separator of $G$, and
the vertices $v, w$ are in the same subgraph $G_i$ of $G$ induced by $C$, then

$dist(v, w, G) = \min\{dist(v, w, G_i), dist_C(v, w, G)\}$.
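
The identity of Lemma 1 can be checked numerically on a small graph. The sketch below is only an illustration: the example graph, the helper names `bfs` and `dist_via`, and the choice of separator are assumptions, and $dist_C$ is computed as the shortest walk through a separator vertex, which is enough for the minimum in the lemma.

```python
from collections import deque

INF = float("inf")


def bfs(adj, src, allowed=None):
    """Hop distances from src; if `allowed` is given, stay inside that vertex set."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist and (allowed is None or w in allowed):
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def dist_via(adj, v, w, C):
    """Length of a shortest v-w walk that visits at least one vertex of C."""
    dv, dw = bfs(adj, v), bfs(adj, w)
    return min((dv.get(c, INF) + dw.get(c, INF) for c in C), default=INF)


# Hypothetical example: vertex 0 separates the path 1-2-3-4 from the leaves 5, 6, 7.
adj = {
    0: {1, 4, 5, 6, 7},
    1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0},
    5: {0}, 6: {0}, 7: {0},
}
C = {0}
Gi = {1, 2, 3, 4}  # one subgraph left after deleting C

inside = bfs(adj, 1, allowed=Gi)[4]   # path 1-2-3-4 inside Gi
through = dist_via(adj, 1, 4, C)      # path 1-0-4 through the separator
overall = bfs(adj, 1)[4]              # true distance in G
```

For the pair (1, 4) the path through the separator is the shorter one, while for the pair (1, 2) the path inside the subgraph wins; in both cases the minimum equals the distance in the whole graph.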

Lemma 2. For every well-$(\alpha, g)$-separated graph $G$, and vertices $v, w \in V$, the
output of Algorithm Dist_Comput is $dist(v, w, G)$.

Proof. Consider an arbitrary pair of vertices $v, w \in V$, with labels

$L(v) = J_1(v) \circ \ldots \circ J_p(v) \circ I(v)$ and $L(w) = J_1(w) \circ \ldots \circ J_q(w) \circ I(w)$.

For every $1 \le i \le \min\{p, q\}$, let $J_i(v) = (M(v, D^{v,C^i}), t_i^v)$ and
$J_i(w) = (M(w, D^{w,C^i}), t_i^w)$.

1. If $L(v) = L(w)$ then Return 0

2. Let $k$ be the minimum index such that $t_k^v \ne t_k^w$ or $t_k^v = 0$ or $t_k^w = 0$.

   /* v and w belong to the same subgraph $G^i$ on each level $1 \le i \le k - 1$,
      and are separated on level k. */

3. Initialize $Dist^{sep}(v, w) \leftarrow \infty$

4. For $i = 1$ to $k - 1$ do:

   (a) Let $\xi(v, w, G^i) = (I(v), I(w), M(v, D^{v,C^i}), M(w, D^{w,C^i}))$.

       /* $G^i$ is the subgraph containing both v and w on level i */

   (b) $Dist^{sep}(v, w) \leftarrow \min\{Dist^{sep}(v, w), f^{sgg}(\xi(v, w, G^i))\}$

       /* Calculate the length of the shortest path between v and w
          that passes through one of the separators */

5. /* On the kth level */

   Let $\xi(v, w, G^k) = (I(v), I(w), M(v, D^{v,C^k}), M(w, D^{w,C^k}))$.

   (a) If $t_k^v = t_k^w = 0$ /* Both v and w are in the separator on level k */
       then $DistInGk \leftarrow f^{ss}(\xi(v, w, G^k))$

   (b) Else if $t_k^v = 0$ /* Only v is in the separator */
       then $DistInGk \leftarrow f^{sg}(\xi(w, v, G^k))$

   (c) Else if $t_k^w = 0$ /* Only w is in the separator */
       then $DistInGk \leftarrow f^{sg}(\xi(v, w, G^k))$

   (d) Else if $t_k^v \ne t_k^w$ /* v and w are in different level k subgraphs of G */
       then $DistInGk \leftarrow f^{gg}(\xi(v, w, G^k))$

6. Return $\min\{DistInGk, Dist^{sep}(v, w)\}$.

Fig. 2. Algorithm Dist_Comput($L(v)$, $L(w)$).



Let $1 \le k \le \min\{p, q\}$ be the minimum index such that $v$ and $w$ belong to the
same subgraph on each level $1 \le i \le k - 1$, and are separated on level $k$ or are
both in the separator of this level. For this $k$, one of the conditions (a)-(d) in
the procedure Dist_Comput must hold.

Let us examine the cases considered by the procedure one by one. If $t_k^v = t_k^w = 0$,
then both $v$ and $w$ are in the separator $C^k$, and according to the definition of
the function $f^{ss}$, $dist(v, w, G^k) = f^{ss}(\xi(v, w, G^k))$.

If $t_k^v = 0$ and $t_k^w \ne 0$, then $v$ is in the separator $C^k$, and $w$ is in one
of the subgraphs of $G^k$ on level $k$, and by the definition of the function $f^{sg}$,
$dist(v, w, G^k) = f^{sg}(\xi(w, v, G^k))$. The case $t_k^w = 0$ and $t_k^v \ne 0$ is analogous to
the previous case.

Finally, if $t_k^v \ne t_k^w$ and both are nonzero, then $v$ and $w$ are in different subgraphs
of $G^k$ on level $k$ according to the separator $C^k$, and according to the definition of
the function $f^{gg}$, $dist(v, w, G^k) = f^{gg}(\xi(v, w, G^k))$.

The last step of the procedure returns the minimum of $dist(v, w, G^k)$ and
of $Dist^{sep}(v, w)$, which holds the length of the shortest path between $v$ and $w$
through one of the higher level separators $C^i$, for $1 \le i \le k - 1$. By Lemma 1,
the minimum of $dist(v, w, G^k)$ and of $Dist^{sep}(v, w)$ is the distance between $v$
and $w$. ∎

Lemma 3. The labeling scheme uses $O(g(n) \log n)$ bit labels.

Proof. On each level of the recursive labeling procedure, the sublabel $J_i(v)$ is of
the form $(M(v, D^v), t^v)$. The index $t^v$ clearly requires at most $\log n$ bits.

$M(v, D^v) = (M(v, d_1), \ldots, M(v, d_\alpha)) = ((I(d_1), dist(v, d_1)), \ldots, (I(d_\alpha), dist(v, d_\alpha)))$

is of size $O(\alpha \cdot g(n) + \alpha \log n) = O(g(n))$ as $g(n) \ge \log n$. Since the maximum graph
size is halved in each application of the recursive labeling procedure, there are at
most $\log n$ levels, hence the size of the labels after the recursion is $O(g(n) \cdot \log n)$.
Finally, the size of the initial identifiers assigned at the preprocessing step is
$|I(v)| = g(n)$, which is negligible. ∎
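
The accounting in this proof can be summarized in one display. This is a sketch that treats $\alpha$ as a constant and writes $q$ for the number of recursion levels (a symbol borrowed from the label format, with $q = O(\log n)$), under the proof's assumption $g(n) \ge \log n$:

```latex
\begin{align*}
|J_i(v)| &= \underbrace{\alpha\,\bigl(g(n) + O(\log n)\bigr)}_{M(v,\,D^{v})}
            + \underbrace{O(\log n)}_{t^{v}}
          = O(g(n)), \\
|L(v)|   &= \sum_{i=1}^{q} |J_i(v)| + |I(v)|
          = O(\log n) \cdot O(g(n)) + g(n)
          = O(g(n) \cdot \log n).
\end{align*}
```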

Theorem 1. Any well-$(\alpha, g)$-separated graph family has a distance labeling
scheme of size $O(g(n) \log n)$.

4 Distance Labeling Scheme for Interval Graphs 

4.1 Definitions 

We need to introduce some preliminary definitions concerning interval graphs.
Given a finite number of intervals on a straight line, a graph associated with
this set of intervals can be constructed in the following manner. Each interval
corresponds to a vertex of the graph, and two vertices are connected by an edge
if and only if the corresponding intervals overlap at least partially.

Let $G = (V, E)$ be a connected $n$-vertex interval graph. For every $v \in V$, we
use the interval representation $\mathcal{I}(v) = [l(v), r(v)]$ where $l(v)$ (respectively, $r(v)$)
is the left (resp., right) coordinate of $v$.

Example: Figure 3 describes the intervals corresponding to an 8-vertex interval
graph $G_{int}$ and Figure 0 describes the intervals corresponding to a 6-vertex
interval graph $G_{path}$, which will be used throughout what follows to illustrate
our basic notions and definitions. □



Fig. 3. An 8-vertex interval graph $G_{int}$ in its interval representation.



For every real $x \in \mathbb{R}$, let $V_L(x)$ denote the set of all vertices whose intervals
end before $x$ (scanning the intervals from left to right), let $V_R(x)$ denote the set
of vertices whose intervals start after $x$ and let $C(x)$ denote the set of vertices
whose intervals contain $x$, i.e.,

$V_L(x) = \{v \in V \mid r(v) < x\}$,

$V_R(x) = \{v \in V \mid l(v) > x\}$,

$C(x) = \{v \in V \mid l(v) \le x \le r(v)\}$.

Example (cont.): In Figure 3, $V_L(x) = \{v_1, v_2, v_3\}$, $V_R(x) = \{v_6, v_7, v_8\}$ and
$C(x) = \{v_4, v_5\}$. □
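
The three sets are easy to compute from an interval representation. The sketch below is illustrative only: the coordinates in `INTERVALS` are hypothetical (the figure's actual coordinates are not available here) and were chosen so that the point $x = 8$ reproduces the sets of the example; `split_at` and `balanced_point` are names invented for this illustration. The helper `balanced_point` picks a median left endpoint, which is one simple way to guarantee that both sides contain at most $n/2$ vertices.

```python
def split_at(intervals, x):
    """Return (V_L(x), V_R(x), C(x)) for a dict mapping vertex -> (l, r)."""
    v_left = {v for v, (l, r) in intervals.items() if r < x}
    v_right = {v for v, (l, r) in intervals.items() if l > x}
    cut = {v for v, (l, r) in intervals.items() if l <= x <= r}
    return v_left, v_right, cut


def balanced_point(intervals):
    """A point x with |V_L(x)| <= n/2 and |V_R(x)| <= n/2 (median left endpoint)."""
    lefts = sorted(l for l, _ in intervals.values())
    return lefts[(len(lefts) - 1) // 2]


# Hypothetical coordinates for an 8-vertex interval graph.
INTERVALS = {
    "v1": (0, 2), "v2": (1, 4), "v3": (3, 6), "v4": (5, 12),
    "v5": (6, 13), "v6": (11, 15), "v7": (14, 17), "v8": (16, 18),
}

V_L, V_R, C = split_at(INTERVALS, 8)
x_mid = balanced_point(INTERVALS)
```

Every interval with a left endpoint larger than the median lies among the upper half of the sorted left endpoints, and every interval ending before the median starts before it, which is why both sides stay within $n/2$.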

For every set of vertices $U$, let $L(U)$ be the leftmost vertex in $U$ according
to the left endpoint of its interval, that is, $L(U) = v$ if $v \in U$ and $l(v) \le l(w)$
for all $w \in U$. The rightmost vertex of $U$, $R(U)$, is defined in an analogous way,
for the right side.

For every vertex $v \in V_L(x)$ to the left of $x$, we identify a special contact
vertex for $v$ in $C = C(x)$, denoted $Far(v, C)$, which is the “rightmost” possible
vertex in $C$ (w.r.t. its right endpoint) still within distance $dist(v, C)$ from $v$.
Formally, for every $v \in V_L(x)$ and $C \subseteq V$, $V_L(x) \cap C = \emptyset$, $Far(v, C)$ is the
vertex satisfying $Far(v, C) \in C$, $dist(v, Far(v, C)) = dist(v, L(C))$, and for
every $w \in C$, $dist(v, w) = dist(v, L(C)) \Rightarrow r(Far(v, C)) \ge r(w)$. We abbreviate
this as $Far(v)$ whenever $C$ is understood from the context.

Our algorithm also makes use of a vertex slightly closer to $v$ than $Far(v, C)$;
let $Far^-(v, C)$ be the rightmost vertex in $V_L(x)$ that can be reached from $v$ in one
step less than $Far(v, C)$, i.e., $dist(v, Far^-(v, C))$ is the distance between $v$ and
$C$ minus 1. Formally, for every $v \in V_L(x)$, $Far^-(v, C)$ is the vertex satisfying
$dist(v, Far^-(v)) = dist(v, L(C)) - 1$, and for every $w \in V_L(x)$, $dist(v, w) =
dist(v, L(C)) - 1 \Rightarrow r(Far^-(v, C)) \ge r(w)$. Again, we use $Far^-(v)$ for short
whenever no confusion may arise.

For every $v \in V_R(x)$, define $Far(v)$ and $Far^-(v)$ in an analogous way:

1. $Far(v, C)$ is the vertex satisfying $Far(v, C) \in C$, $dist(v, Far(v, C)) = dist(v, R(C))$,
   and for every $w \in C$, $dist(v, w) = dist(v, R(C)) \Rightarrow l(Far(v, C)) \le l(w)$.

2. $Far^-(v, C) \in V$ is the vertex satisfying $dist(v, Far^-(v, C)) = dist(v, R(C)) - 1$,
   and for every $w \in V_R(x)$, $dist(v, w) = dist(v, R(C)) - 1 \Rightarrow l(Far^-(v, C)) \le l(w)$.

Finally, let $\delta(v) = dist(v, Far(v))$ and $\rho(v, w) = \delta(v) + \delta(w)$.

Example (cont.): In Figure 3, if $C = \{v_4, v_5\}$ then $L(C) = v_4$ and $R(C) = v_5$.
Given this separator, $Far(v_1) = v_5$, $Far(v_8) = v_4$, $Far^-(v_1) = v_3$ and
$Far^-(v_8) = v_6$. Therefore $\delta(v_1) = 3$, $\delta(v_8) = 3$ and $\rho(v_1, v_8) = 3 + 3 = 6$. □



4.2 Well-(2, log n)-Separation of Interval Graphs 

Let us show that interval graphs are well-$(2, \log n)$-separated. Let us start with
a high-level overview of our method. The separator $C$ consists of the “middle”
intervals in the graph. For enabling distance calculations, the labels of all the
other intervals in the graph must encode the distance to the nearest interval in
the separator $C$, and also to the immediate previous interval in the path to $C$.

The distances between the intervals correspond to the distances between
vertices in the graph.

A useful property of this kind of separator is that the vertices in it form a
clique in the graph, so the distance between them is 1. Therefore, the distance
between two intervals from different sides of the separator $C$ will be either the
sum of their distances to $C$, or, in case there is no overlap between these paths,
the sum of the distances plus 1.

If the intervals are on the same side of $C$, then their distance is calculated
recursively, as we continue to divide the subgraph by separators, and calculate
the distances to these new separators.

For calculating the distance from an interval $\mathcal{I}(w)$ in $C$ to an interval $\mathcal{I}(v)$
that is not, we check the distance from $\mathcal{I}(v)$ to $C$; this is the distance between
the two intervals if $\mathcal{I}(w)$ overlaps the path of $\mathcal{I}(v)$ to $C$. If there is no such
overlap, the distance increases by 1.

Let us now proceed with a more formal description of the construction. 

The separator: Given a graph $G = (V, E)$, we choose the separator $C$ as
follows. Choose a real $x$ such that $|V_L(x)| \le n/2$ and $|V_R(x)| \le n/2$. Let
$C = \{v \in V \mid l(v) \le x \le r(v)\}$.

Example (cont.): In Figure 3, $C = \{v_4, v_5\}$. □

The identifiers: Define the identifier $I(v)$ as follows. For every $v \in V$, set
$I(v) = (K(v), l(v), r(v))$ where $K(v)$ is a distinct number between 1 and $n$. The
size of $I(v)$ is $O(\log n)$.

The reference sets: The set $D^{v,C}$ is defined as:

$D^{v,C} = \{Far(v), Far^-(v)\}$ if $v \in V(G) \setminus C$, and $D^{v,C} = \emptyset$ otherwise.

Hence the tuple $\xi(v, w, G)$ for $v, w \in V(G) \setminus C$ becomes

$\xi(v, w, G) = (I(v), I(w), M(v, D^{v,C}), M(w, D^{w,C}))$

$= (I(v), I(w),$
$\quad ((I(Far(v)), dist(v, Far(v))), (I(Far^-(v)), dist(v, Far^-(v)))),$
$\quad ((I(Far(w)), dist(w, Far(w))), (I(Far^-(w)), dist(w, Far^-(w)))))$

$= (I(v), I(w),$
$\quad ((I(Far(v)), \delta(v)), (I(Far^-(v)), \delta(v) - 1)),$
$\quad ((I(Far(w)), \delta(w)), (I(Far^-(w)), \delta(w) - 1)))$.

The distance functions: Finally, we define the functions $f^{ss}$, $f^{sg}$, $f^{gg}$ and $f^{sgg}$.

- For $v, w \in C$, $v \ne w$, $f^{ss}(\xi(v, w, G))$ returns 1.

- For $v \in V_L(x)$ and $w \in C$, $f^{sg}(\xi(v, w, G))$ is computed as follows.

  Extract $M(v, D^{v,C})$ from $\xi(v, w, G)$ and calculate $\delta(v)$.

  If $l(w) \le r(Far^-(v))$ then return $\delta(v)$,

  Else return $\delta(v) + 1$.

- For $v \in V_L(x)$ and $w \in V_R(x)$, $f^{gg}(\xi(v, w, G))$ is computed as follows.

  Extract $M(v, D^{v,C})$ and $M(w, D^{w,C})$ from $\xi(v, w, G)$, and calculate $\rho(v, w)$.

  If $l(Far^-(w)) \le r(Far(v))$ or $l(Far(w)) \le r(Far^-(v))$

  then return $\rho(v, w)$,

  Else return $\rho(v, w) + 1$.

- For $v, w \in V_L(x)$, $f^{sgg}(\xi(v, w, G))$ is computed as follows.

  Extract $M(v, D^{v,C})$ and $M(w, D^{w,C})$ from $\xi(v, w, G)$.

  Return $\rho(v, w)$.
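
The rules for $f^{sg}$ and $f^{gg}$ can be checked against breadth-first search on small instances. The sketch below is an illustration under several assumptions: the class `Scheme`, the interval coordinates and the helper names are hypothetical; $Far$, $Far^-$ and $\delta$ are recomputed from the graph rather than read from labels; and the rule for a vertex on the right of $x$ is taken to be the mirror image of the stated left-side rule.

```python
from collections import deque


def build(iv):
    """Interval graph: two closed intervals are adjacent when they share a point."""
    return {a: {b for b in iv if b != a
                and iv[a][0] <= iv[b][1] and iv[b][0] <= iv[a][1]} for a in iv}


def bfs(adj, src):
    dist, queue = {src: 0}, deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


class Scheme:
    def __init__(self, iv, x):
        self.iv, self.adj = iv, build(iv)
        self.VL = {v for v in iv if iv[v][1] < x}
        self.VR = {v for v in iv if iv[v][0] > x}
        self.C = set(iv) - self.VL - self.VR

    def ref(self, v):
        """(Far(v), Far^-(v), delta(v)) for a vertex v outside the separator."""
        d = bfs(self.adj, v)
        delta = min(d[c] for c in self.C)
        if v in self.VL:   # left side: prefer the largest right endpoint
            side, key = self.VL, lambda u: self.iv[u][1]
        else:              # right side (mirror): prefer the smallest left endpoint
            side, key = self.VR, lambda u: -self.iv[u][0]
        far = max((c for c in self.C if d[c] == delta), key=key)
        far_minus = max((u for u in side if d.get(u) == delta - 1), key=key)
        return far, far_minus, delta

    def f_sg(self, v, w):
        """v outside the separator, w inside it."""
        _, fm, delta = self.ref(v)
        if v in self.VL:
            touches = self.iv[w][0] <= self.iv[fm][1]
        else:
            touches = self.iv[w][1] >= self.iv[fm][0]
        return delta if touches else delta + 1

    def f_gg(self, v, w):
        """v in V_L(x), w in V_R(x)."""
        fv, fmv, dv = self.ref(v)
        fw, fmw, dw = self.ref(w)
        rho = dv + dw
        if self.iv[fmw][0] <= self.iv[fv][1] or self.iv[fw][0] <= self.iv[fmv][1]:
            return rho
        return rho + 1


def mismatches(s):
    """Pairs on which f_sg or f_gg disagrees with the BFS distance."""
    bad = []
    for v in s.VL | s.VR:
        d = bfs(s.adj, v)
        bad += [(v, w) for w in s.C if s.f_sg(v, w) != d[w]]
        if v in s.VL:
            bad += [(v, w) for w in s.VR if s.f_gg(v, w) != d[w]]
    return bad


# Hypothetical instances: the first mimics the running example, the second is a path.
first = Scheme({"v1": (0, 2), "v2": (1, 4), "v3": (3, 6), "v4": (5, 12),
                "v5": (6, 13), "v6": (11, 15), "v7": (14, 17), "v8": (16, 18)}, 8)
second = Scheme({"a": (0, 2), "b": (1, 5), "c": (4, 9), "d": (8, 12), "e": (11, 14)}, 8)
```

The second instance exercises the "+1" branches: there the separator is {c, d}, and a vertex on one side reaches only one of the two separator intervals directly.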



4.3 Correctness Proof 

We next show that the functions $f^{ss}$, $f^{sg}$, $f^{gg}$ and $f^{sgg}$ defined above obey
the requirements of the well-separation property. As a consequence, Algorithm
Dist_Comput calculates the right distance between every two vertices. To
simplify the scheme, we comment that the function $f^{sgg}$ is not necessary for interval
graphs, because in these graphs the distances grow monotonically, so for any two
vertices in a subgraph $G'$, the shortest path between their intervals inside $G'$ is
never longer than the shortest path that also contains vertices from outside $G'$.
So the distances according to the scheme are calculated in a similar way, except
for the use of the function $f^{sgg}$. The return value of Algorithm Dist_Comput is
always the return value of one of the functions $f^{ss}$, $f^{sg}$ and $f^{gg}$.

We rely on the following basic property of interval graphs. 

Fact 2 [Gol80] Any induced subgraph of an interval graph is an interval graph.



Lemma 4. C is a separator. 

Thus, for every interval graph G = (V,E), V_L(x) and V_R(x) are interval
graphs and therefore if G is well-separated, V_L(x) and V_R(x) are well-separated
too.

By the definition of Far(v) and Far~(v) we have:

Lemma 5. For every v ∈ V,

1. δ(v) = dist(v,Far(v)) = dist(v,Far~(v)) + 1,

2. dist(Far(v),Far~(v)) = 1.

Our analysis makes use of the following technical lemma.

Lemma 6. For every v ∈ V_L(x) and w ∈ V_R(x), dist(v,w) ≥ p(v,w).

Lemma 7. For every v ∈ V_L(x) and w ∈ V_R(x) such that l(Far~(w)) <
r(Far(v)), the corresponding intervals of Far(v) and Far~(w) have a point
in common.



Lemma 8. For every v,w gC, dist{v,w) = f‘‘^{f{v,w,G)). 



Lemma 9. For every v G Vl{x) and w G C, dist{v,w) = w, G)). 

Proof. There are two cases to consider. If l(w) < r(Far~(v)), then the claim
holds because the overlap between the intervals of Far~(v) and w implies
that

dist(v,w) ≤ dist(v,Far~(v)) + dist(Far~(v),w) = (δ(v) - 1) + 1 = δ(v),

and on the other hand w ∈ C, and thus dist(v,w) ≥ δ(v). Hence dist(v,w) =
δ(v), which is the value returned in this case.

Otherwise, if l(w) > r(Far~(v)), then the claim holds because

dist(v,w) ≤ dist(v,Far(v)) + dist(Far(v),w) = δ(v) + 1,

and on the other hand there is no overlap between the intervals of w and
Far~(v), so by the properties of Far(v) and Far~(v), we have dist(v,w) =
δ(v) + 1, which is the value returned in this case. □

Example (cont.): In the figure of the running example, for v = v1 and w = v4,
the first case occurs, as Far(v1) = v5 and Far~(v1) = v3. l(v4) <
r(Far~(v1)) = r(v3). δ(v1) = 3, therefore dist(v1,v4) = 3.

In Figure 4, for v = v1 and w = v4, the second case occurs. Far(v1) = v3 and
Far~(v1) = v2. l(v4) > r(Far~(v1)) = r(v2). δ(v1) = 2, therefore dist(v1,v4) =
δ(v1) + 1 = 3. □



Fig. 4. An interval representation of a 6-vertex path G_path.



Lemma 10. For every v G Vl{x) and w G Vr{x), dist{v,w) = f^^{^{v,w,G)). 
Proof. There are four cases to be examined. When Far(v) = Far(w), by the
triangle inequality,

dist(v,w) ≤ dist(v,Far(v)) + dist(Far(w),w) = p(v,w).

By Lemma 6, dist(v,w) ≥ p(v,w), so dist(v,w) = p(v,w), and in this case
the function returns the right distance.

Now, assume Far(v) ≠ Far(w). When l(Far~(w)) < r(Far(v)), by Lemma 7
the intervals of Far(v) and Far~(w) have a point in common, in particular
the point l(Far~(w)), therefore dist(Far(v),Far~(w)) = 1. By the triangle
inequality,

dist(v,w) ≤ dist(v,Far(v)) + dist(Far(v),Far~(w)) + dist(Far~(w),w)
          = δ(v) + 1 + (δ(w) - 1) = p(v,w).

By Lemma 6, dist(v,w) = p(v,w).

The case when l(Far(w)) < r(Far~(v)) is handled in the same way.

The remaining case is when Far(v) ≠ Far(w), l(Far~(w)) > r(Far(v))
and l(Far(w)) > r(Far~(v)). Because there is no overlap between the intervals
of Far(v) and Far~(w), nor between the intervals of Far(w) and Far~(v), we
have dist(Far~(v),Far(w)) = 2 and dist(Far(v),Far~(w)) = 2, thus

dist(v,w) = dist(v,Far(v)) + dist(Far(v),Far(w)) + dist(Far(w),w)
          = p(v,w) + 1. □

Example (cont.): In the figure of the running example, if v = v1 and w = v8,
then Far(v1) = v5, Far(v8) = v4, Far~(v1) = v3 and Far~(v8) = v6. In this
case, Far(v) ≠ Far(w) and l(Far~(w)) < r(Far(v)). δ(v1) = 3, δ(v8) = 3, and
p(v1,v8) = 3 + 3 = 6. l(Far~(v8)) < r(Far(v1)), therefore dist(v1,v8) = 6.

In Figure 4, take v = v1 and w = v6. Here, Far(v) ≠ Far(w), l(Far~(w)) >
r(Far(v)) and l(Far(w)) > r(Far~(v)), because Far(v1) = v3, Far(v6) = v4,
Far~(v1) = v2 and Far~(v6) = v5. δ(v1) = 2, δ(v6) = 2, and p(v1,v6) =
2 + 2 = 4, therefore dist(v1,v6) = p(v1,v6) + 1 = 5. □

As a consequence of the lemmas above, we get

Corollary 1. The class of interval graphs is well-(2, log n)-separated. □

Theorem 3. The class of interval graphs enjoys an O(log^2 n) distance labeling
scheme. □
Abstract. Given a graph, removing pendant vertices (vertices with only
one neighbor) and vertices that have a twin (another vertex that has the
same neighbors) until it is not possible yields a reduced graph, called the
“pruned graph”. In this paper, we present an algorithm which computes
this “pruned graph” either in linear time or in linear space. In order to
achieve these complexity bounds, we introduce a data structure based on
digital search trees. Originally designed to store a family of sets and to
test efficiently equalities of sets after the removal of some elements, this
data structure finds interesting applications in graph algorithmics. For
instance, the computation of the “pruned graph” provides a new and simply
implementable algorithm for the recognition of distance-hereditary
graphs, and we improve the complexity bounds for the complete bipartite
cover problem on bipartite distance-hereditary graphs.



Keywords: graph algorithms, distance-hereditary graphs, sets, digital search 
trees, amortized complexity. 



1 Introduction 

Distance-hereditary graphs were introduced by E. Howorka in 1977. Originally
defined as the graphs in which every induced path is isometric, this class
has since received many other characterizations: forbidden subgraphs,
properties of cycles, metric characterizations, etc. (see [BM86] for a survey).
We focus on one of them: G is a distance-hereditary graph iff every induced
subgraph of G with at least two vertices contains a pendant vertex (a vertex
with only one neighbor) or a vertex that has a twin (another vertex that has
the same neighbors). Thus repeatedly removing from a graph G a pendant vertex
or a vertex that has a twin, whenever possible (a process called “to prune the
graph”), leads to a graph reduced to one single vertex iff G is a
distance-hereditary graph.

This pruning process has a special property, which enables us to define a
reduction on graphs thanks to the following theorem: given a graph, whatever
order is chosen to remove pendant vertices and vertices with twins, the process
leads to a unique graph, up to isomorphism. We will call this reduced graph the
pruned graph, and the problem of computing the “pruned graph” of any graph appears


H. Reichel and S. Tison (Eds.): STAGS 2000, LNCS 1770, pp. 529-E^3 2000. 
(c) Springer- Verlag Berlin Heidelberg 2000 



530 Jean-Marc Lanlignel, Olivier Raynaud, and Eric Thierry 



as a generalization of the recognition of distance-hereditary graphs (graphs with
a single vertex as “pruned graph”).

Figure 1 shows a sequence of deletions which leads to the “pruned graph”.




Fig. 1. Pruning process 



The aim of this article is to provide algorithms to compute the pruned graph
of any graph, with the best upper bounds we can obtain on the complexity. To
give a first bound, note that if the graph is given by its adjacency lists, the
algorithm which consists of finding pendant vertices and twins by comparing
adjacency lists and removing these vertices one by one would lead to an O(mn)
time complexity, where n is the number of vertices and m the number of edges.
Our algorithms are based on another representation of the graph: instead of
keeping the adjacency lists, we consider the neighbourhoods of the vertices as
a family of subsets of the set of all vertices, and we work on the digital
search tree associated to this family. Storing the graph in such a data
structure, also called a trie, enables us to perform, with good complexities,
the deletion of one element in all the subsets, the deletion of a subset
(corresponding to the deletion of a vertex), as well as the detection of equal
subsets (corresponding to the detection of twin vertices). We will present three
slightly different variants of this data structure and the associated algorithms,
which lead to the following upper bounds for the computation of the pruned
graph (n is the initial number of vertices, m the initial number of edges):



    Time bound     m + n     m log n     n^2
    Space bound    n^2       m log n     m + n
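The naive approach mentioned above (compare neighbourhoods, remove one vertex, repeat) can be sketched as follows. This is a minimal illustration, not one of the three variants of this paper: it makes no attempt to reach even the O(mn) bound, and the function name `prune_naive` and the dictionary-of-sets graph encoding are assumptions made for the example.

```python
def prune_naive(adj):
    """Return the pruned graph of `adj` (vertex -> iterable of neighbours)."""
    graph = {v: set(ns) for v, ns in adj.items()}

    def find_removable():
        # A pendant vertex has exactly one neighbour.
        for v, ns in graph.items():
            if len(ns) == 1:
                return v
        # x and y are twins when N(x) \ {y} == N(y) \ {x}.
        vertices = list(graph)
        for i, x in enumerate(vertices):
            for y in vertices[i + 1:]:
                if graph[x] - {y} == graph[y] - {x}:
                    return y
        return None

    while len(graph) > 1:
        v = find_removable()
        if v is None:
            break
        for u in graph.pop(v):      # delete v and all its occurrences
            graph[u].discard(v)
    return graph


path4 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
cycle5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
```

A path is distance-hereditary and prunes down to a single vertex, whereas the cycle on five vertices has neither pendant vertices nor twins and is left unchanged.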



In section 2, all the operations we need in order to compute the “pruned
graph” are restated in terms of sets and operations on sets. We choose this point
of view because our algorithms and the complexity analyses we give apply to
any family of sets.

In section 3, we give the applications to graph problems. The computation
of the “pruned graph” provides a new algorithm for the recognition of
distance-hereditary graphs, with the complexities mentioned above, depending on
which variant is chosen. They are not linear in both time and space, but the only
known linear algorithm is very recent and complex (for instance it uses the
modular decomposition of cographs as a subroutine), to the point that a first
release was not fully functional.

Wishing to take advantage of the graph structure to obtain faster algorithms, we
also prove that some good orderings of the vertices yield an algorithm that is
linear both in time and space in some special pruning cases, for instance the
computation of the “pruned graph” of bipartite graphs.

As a final example, we improve the complexity bounds for the complete
bipartite cover problem on bipartite distance-hereditary graphs, with a linear
algorithm, whereas the only known algorithm, proposed in [Mül96], runs with an
O(mn) time complexity.



2 Data Structure 

2.1 The General Problem 

Let G = (V, E) be a non oriented graph. N(x) = {y ∈ V, y ≠ x | {x, y} ∈ E}
is the open neighborhood of x in G, and N_c(x) = N(x) ∪ {x} its closed
neighborhood. An x ∈ V such that #N(x) = 1 is called a pendant vertex.
Some x, y ∈ V such that N(x)\{y} = N(y)\{x} are called twin vertices.
Thus twins share the same vertices as neighbors, and may or may not be linked
together. We will call non-adjacent twins true twins, and adjacent twins false
twins.

To compute the “pruned graph” of G, we mainly have to detect true twins,
namely vertices x, y such that N(x) = N(y), and false twins, namely vertices
x, y such that N_c(x) = N_c(y). The detection of pendant vertices only needs a
data structure to keep track of the degree #N(x) of each vertex x. Then we
need to delete vertices from the graph, and try to detect new removable vertices,
and so on.

This process may be expressed in another way. Let us consider the family of
sets {N(x) | x ∈ V} ∪ {N_c(x) | x ∈ V}, which completely characterizes the graph G.
We mainly have to detect equal sets to detect twins (note that for any x, y,
N_c(x) = N(y) is never possible). After the deletion of a vertex x, we have to
update the family of sets by deleting the sets N(x) and N_c(x), and suppress all
the occurrences of x in the other sets. In the whole of section 2, we deal with
these operations on sets, and then we describe the complete algorithm on graphs
in section 3. We now reformulate our main problem in terms of sets.

Let X be a set of cardinality n. Suppose we are given k sets on X, S_1, ..., S_k.
The problem is to find a data structure that will enable us, with the best
amortized complexity, to perform the following operations:

Operation Equality: Find (i, j) such that S_i = S_j, i ≠ j, or detect that all
sets are different.

Operation Set delete: Given i, delete S_i.

Operation Element delete: Given x ∈ X, delete all occurrences of x from all
sets.
2.2 General Survey of the Data Structure

The input thus consists of k sets, S_1, ..., S_k, S_i ⊆ X, |X| = n. Let l_i be
the cardinal of S_i, or its length if we view sets as words. Let d be the size
of the input: we suppose that

    d ∈ Ω(k + Σ_{i=1..k} l_i).

This is obviously true if the sets are given as lists of elements. In particular, in
our pruning graph problem, d ∈ O(m + n) and k = n.

We need an order on X, since sorting will be involved. For the general results
there is no need to compute a particular order, so the implicit order (from the
input) will do. But in some particular cases we are able to guarantee linear time
and space algorithms if we compute a well-chosen order.

Note that the order identifies X and [1, n]: we can identify the sets on X
(e.g. the S_i's) with the words on [1, n] whose letters are sorted.

As a second step, we store in a trie (a tree coding words thanks to labelled
edges; see Knuth) the words with sorted letters that represent our sets. We call
this trie a lexicographic tree. This structure is best understood with an example:
Fig. 2 shows how each edge is labelled with an element of X, and some nodes
with some of the S_i's. Note that the edges are sorted from left to right; that
each path from the root meets edge labels in increasing order; and that when
S_i labels a node, then the path from the root to this node is exactly S_i.

S_1 = {1, 2, 3, 4}
S_2 = {1, 3}
S_3 = {2, 3, 5}
S_4 = {2, 3}
S_5 = {1, 3}
subsets of X = {1, 2, 3, 4, 5}

Fig. 2. A lexicographic tree

No two edges with the same parent bear the same label. Thus for a given i
exactly one node is labelled with S_i, and this node is also labelled with all
S_j's equal to S_i. Note also that all leaves correspond to at least one S_i,
and that there are O(d) edges.

The name of the structure comes from the observation that the leaves are
lexicographically sorted.
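The construction just described can be illustrated on the five sets of Fig. 2. The sketch below is a simplification: it uses nested Python dictionaries rather than the nodge records introduced next, and the names `build_lexicographic_tree` and `equal_groups` are hypothetical.

```python
def build_lexicographic_tree(sets):
    """Insert each set as the word formed by its sorted elements."""
    root = {"children": {}, "labels": []}
    for name, members in sets.items():
        node = root
        for x in sorted(members):
            node = node["children"].setdefault(x, {"children": {}, "labels": []})
        node["labels"].append(name)   # the path from the root spells this set
    return root


def equal_groups(node):
    """Collect the groups of set names that label the same node (equal sets)."""
    groups = [node["labels"]] if len(node["labels"]) > 1 else []
    for x in sorted(node["children"]):
        groups.extend(equal_groups(node["children"][x]))
    return groups


family = {"S1": {1, 2, 3, 4}, "S2": {1, 3}, "S3": {2, 3, 5},
          "S4": {2, 3}, "S5": {1, 3}}
tree = build_lexicographic_tree(family)
groups = equal_groups(tree)
```

Here `groups` contains the single group formed by S2 and S5, the only two equal sets of the example; they label the same node, reached from the root by the edges 1 and 3.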

Our data structure for the tree will have some variants, because the coding of
the tree structure (parent, children and siblings) is a key point for the complexity
analysis, and different choices for kinship will lead to different complexities in
time and space.

Concerning the implementation, the basic item is an edge together with the
node it leads to. When speaking of the data structure we call this a nodge (see
Fig. 3). It holds the following information:

— A 0 (for instance) marking this structure as being a nodge.

— Two pointers at sets, the “first” and the “last” sets labelling this node. See
below how the node labels in our tree are organized.

— The label x ∈ X of its edge (say 0 for the root nodge).

— Pointers at the next and previous nodges with the same edge label (or NIL
if this is the last nodge). The nodges with the same edge label are thus
organized in a doubly linked list.

— A pointer at the parent nodge (if this nodge is not the root).

— Some more coding of the tree structure for kinship, to be described later.




Fig. 3. A nodge (fields shown: circular linked list of the node labels; kinship
information; edge label x, 0 for the root; linked list of the nodges with edge
label x; pointer at the parent nodge)



There are two kinds of sets: the ones that are the first label of their nodge,
and the others, which are duplicates of the first ones. The duplicates are held
in a global doubly linked list, the entry of which we call D. If all the nodges
of the tree are labelled by at most one S_i then the list D will be empty.

The information concerning the sets is stored in an array with k entries, with
at entry i:

— An i marking this entry as concerning S_i (to tell this entry from a nodge),

— Two pointers at the next and previous equal sets (or to the nodge it labels),
as explained.

— Two pointers for the doubly linked list D if S_i is a duplicate (say two NIL's
if not).

Finally we need an array of size n for the letters; the entry number x simply
consists of a pointer at the first nodge of edge label x.

2.3 Operations 

We first have to build the lexicographic tree from sets given by lists of elements:
we call that the initialization. This can be done in linear time and space,
subject to the choice of the kinship coding: generate the couples (i, x) for
x ∈ S_i, bucket sort them according to x and build the tree from the root (see
[WWW]). We will check in section 2.4 that the extra time and space needed to
initialize the kinship coding remain within suitable bounds.

We will need, as a base step for the operations we eventually want, to know
how to delete a nodge. Given a nodge N, we have to delete it, and to put each of
its children among N's siblings, with possible merging, which leads to recursive
sub-tree merging. Therefore the low-level operation is the merging of a nodge
into another one. An edge can be deleted for two reasons: either through direct
deletion (delete-nodge), or because of a merging into another edge (recursive
calls to merge-nodges). Figure 4 shows the deletion of a nodge, and the resulting
merging of nodges.




Fig. 4. Deletion of a nodge 

These two small procedures (Algorithms 1 and 2) are self-describing. Only
the “critical loop” of merge-nodges needs a more thorough description and
a complexity analysis, since it strongly depends on the data structure to be
described later. Everything else leads to O(1) operations per deleted nodge,
that is, O(d) for all delete-nodge calls. The global operations are now easy to
describe.

Operation Equality: Use the duplicates list D to know whether there are some
equal sets, and if so, to find two equal ones;

Operation Set delete: Remove the set as a node label. Perhaps this creates
a leaf without a node label, and a branch is to be deleted. Each nodge deleted
this way (with a delete-nodge call) costs only O(1) time, since it is childless;

Operation Element delete: While there are some nodges of the given edge
label, call delete-nodge.



Algorithm 1: delete-nodge(N)

Let N' be the parent of N
Remove N from the children of N' (removing a child given by a pointer
will be an O(1) operation)            /* Removes N from the tree */
merge-nodges(N, N')                   /* Updates the tree */

Algorithm 2: merge-nodges(N, N')

/* Merges the subtree above N with the subtree above N' (after recursive calls) */
Merge the node labels of N into those of N'
if both N and N' had labels then
    Add the set that is no longer first label to the duplicates list
Remove N from its list of nodges with the same edge label
foreach child N_1 of N do
    /* Critical loop: to be fully described later */
    Let x be the edge label of N_1
    if N' has a child N'_1 of edge label x then
        merge-nodges(N_1, N'_1)
    else
        Insert N_1 among the children of N'
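Algorithms 1 and 2 can be tried out with the following Python sketch. It is an illustration under several assumptions, none of which comes from the paper: kinship is coded with a Python dictionary (not one of the three variants analysed below), the equality operation scans the labelled nodes instead of maintaining the duplicates list D, the set delete operation is left out, and the names `Nodge`, `LexTree`, `members` and `equal_pair` are hypothetical.

```python
class Nodge:
    def __init__(self, label=0, parent=None):
        self.label = label      # edge label (0 for the root)
        self.parent = parent
        self.children = {}      # edge label -> child nodge (dict: an assumption)
        self.sets = []          # indices i of the sets S_i labelling this node


class LexTree:
    def __init__(self, sets):
        self.root = Nodge()
        self.by_label = {}      # element x -> nodges whose edge label is x
        self.node_of = {}       # index i -> nodge labelled by S_i
        for i, members in sets.items():
            node = self.root
            for x in sorted(members):
                if x not in node.children:
                    node.children[x] = Nodge(x, node)
                    self.by_label.setdefault(x, set()).add(node.children[x])
                node = node.children[x]
            node.sets.append(i)
            self.node_of[i] = node

    def merge_nodges(self, n, target):
        for i in n.sets:                    # merge the node labels of n
            self.node_of[i] = target
        target.sets.extend(n.sets)
        self.by_label[n.label].discard(n)   # n leaves the list of its edge label
        for x, child in list(n.children.items()):   # the "critical loop"
            if x in target.children:
                self.merge_nodges(child, target.children[x])
            else:
                child.parent = target
                target.children[x] = child

    def delete_nodge(self, n):
        parent = n.parent
        del parent.children[n.label]        # removes n from the tree
        self.merge_nodges(n, parent)        # updates the tree

    def element_delete(self, x):
        while self.by_label.get(x):
            self.delete_nodge(next(iter(self.by_label[x])))

    def members(self, i):
        """Read S_i back by walking from its node up to the root."""
        node, out = self.node_of[i], []
        while node.parent is not None:
            out.append(node.label)
            node = node.parent
        return sorted(out)

    def equal_pair(self):
        for node in set(self.node_of.values()):
            if len(node.sets) > 1:
                return tuple(sorted(node.sets[:2]))
        return (0, 0)


tree = LexTree({1: {1, 2, 3, 4}, 2: {1, 3}, 3: {2, 3, 5}, 4: {2, 3}, 5: {1, 3}})
pair_before = tree.equal_pair()
tree.element_delete(5)
tree.element_delete(1)
```

On the sets of Fig. 2, deleting element 5 makes S_3 equal to S_4, and deleting element 1 forces the subtree below the edge labelled 1 to be merged into the root, which is the recursive merging performed by merge-nodges.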



2.4 Complexity Analysis 

Because of a possible long merging of subtrees and because of the repeated
deletions of elements and sets, we are rather interested in amortized complexity
for the operations. The core of the analysis being the “critical loop” of
Algorithm 2, we introduce the maximum time T spent on each child during this
loop. We now give a time bound for any sequence of element delete and set
delete operations.

Lemma 1. The time complexity of all operations element delete and set
delete is in O(dT).

Idea of the proof: We have to show that the total number of children of the
deleted nodges is in O(d). To each child we are able to associate a couple (i, x)
with x ∈ S_i, namely: x is the label of the parent, and S_i labels a descendant. □

This lemma is useful to evaluate the complexity of our first two variants of
the data structure: we simply compute T and check the initialization complexity.

The three variants described next depend on the data structure chosen for
kinship: array, binary search tree or linked list.

The time bounds correspond to the total of all operations element delete,
set delete and equality whatever their order (we suppose that fewer than O(d)
operations equality are performed, each one being in O(1)).

Time O(d + n), Space O(d + kn). This means linear time, but with extra space
needed (for the pruning problem this is O(n^2) space). Here we choose arrays to
code kinship so that T ∈ O(1), and the children of a nodge are linked, making a
doubly linked list. More precisely, we use a somewhat redundant data structure
for the kinship in the tree:

— An array A of size k × n.

— A linked list of the unused lines of A, thus of size less than k.

— Each nodge N holds:

  • 4 more pointers, at the left sibling and right sibling, which makes a linked
    list for the siblings' structure, and at the first and last child (we already
    have the parent; of course NIL pointers are used for nonexistent kin). The
    children of a nodge will not be kept sorted.

  • An index a ∈ [0, k], such that a = 0 if N has one child at most. If N
    has at least two children, then the line a of A consists of pointers at
    the children of N, that is, A[a, x] points at the child of N of edge label
    x if any, or else contains NIL.

The main point is that there are at most k leaves, so at most k nodes have degree
more than 1.

It is easy to delete or insert a given child (given by its edge label) in O(1) time
thanks to these structures. In particular, changing a nodge with several children
into a nodge with a single child, as well as the inverse, are O(1) operations.

It remains to show that the extra data structures can be initialized in the
required time and space; this is obvious if we are given an empty (filled with
NIL) kn amount of space, in which to store A.

If not, we cannot afford the time to empty all that space. However it is
possible to resort to the indirection trick: this simulates empty space
and costs only linear time.
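One classical way of simulating an initialized array without ever clearing it is sketched below; that this is exactly the trick the authors have in mind is an assumption. Python cannot hand out uninitialized memory, so the parameter `garbage` stands for whatever arbitrary content the memory holds, and the class name `LazyArray` and its methods are hypothetical.

```python
class LazyArray:
    """Array that behaves as if filled with `default`, with no clearing pass."""

    def __init__(self, size, garbage=None):
        # `where` plays the role of uninitialised memory: any content is fine.
        self.where = list(garbage) if garbage is not None else [0] * size
        self.vals = [None] * size
        self.used = []              # indices that have really been written

    def _is_set(self, i):
        # Entry i is valid only if `where[i]` points back at i inside `used`.
        p = self.where[i]
        return 0 <= p < len(self.used) and self.used[p] == i

    def get(self, i, default=None):
        return self.vals[i] if self._is_set(i) else default

    def set(self, i, value):
        if not self._is_set(i):
            self.where[i] = len(self.used)
            self.used.append(i)
        self.vals[i] = value


arr = LazyArray(4, garbage=[2, 0, 0, 1])
before = [arr.get(i) for i in range(4)]
arr.set(2, "a")
arr.set(1, "b")
after = [arr.get(i) for i in range(4)]
```

Each access costs O(1) and only the entries actually written are ever touched, which is what keeps the total time linear in the number of operations rather than in kn.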

Time O(d + kn), Space O(d + n) (for the pruning problem this means
linear space but O(n^2) time). Here we choose to link the children of a nodge
with a simple linked list, where the children are kept sorted, the smaller on the
left. Each nodge has a pointer at its right sibling and other pointers at its
first and last child. The initialization does not need extra time, as the children
appear sorted. This is the easiest algorithm, but the amortized cost is
somewhat trickier to prove. The analysis relies on the holes.

Definition 1. An x-hole in the lexicographic tree is a couple (N, x) such that
N is a nodge with edge label x_0, x_0 < x, N has no son of edge label x, and N
has a son of edge label x_1, x_1 > x.

Removing a child given by a pointer is also easy in O(1). Thus the global
time complexity is O(d + n) plus the time spent on the critical loop. We use the
credits method to prove this time to be in O(d + kn): we keep the invariant that
each hole is credited with 1 credit, and each nodge with 2 credits plus 1 credit
per descendant leaf.

Lemma 2. This makes at most 3d + kn credits initially.

The critical loop is a simple sweep of the children of N and N' at the same time,
inserting or merging as we advance. All credits correspond to the same amount
of time. We check precisely what credit is used for each operation, so that the
invariant will be maintained. As a result, the total time is bounded by the
initial number of credits (see [WWW]).



Time and Space O(n + d log n) (this is O(m log n) for the pruning problem).
It is easy to find a structure for which T ∈ O(log n): for instance a binary
search tree, with the edge label as a key. Each nodge simply needs two more
pointers, left and right “sub-child”.

Each child adds an O(log n) space, and that makes a global extra space
O(d log n). Removing a child or inserting one with possible merging is easy in
time O(log n).



3 Application to Graph Problems 



3.1 Pruning a Graph 

Let G be a graph with n vertices and m edges. The input is assumed to consist
of the adjacency lists; it is of size O(n + m). To each v ∈ [1, n], we associate
two sets, S_{2v-1} = N(v) and S_{2v} = N_c(v). We apply to our sets (S_i),
i ∈ [1, 2n], the operations defined earlier.

Deleting a vertex v from the current graph is equivalent to removing S_{2v-1},
S_{2v}, and every edge with label v, from the tree. This is exactly two set delete
operations, and one element delete operation.
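The correspondence between vertices and sets can be made concrete with a short sketch. It is only an illustration: equal sets are detected by hashing frozen sets in a dictionary, which is a swapped-in technique and not the lexicographic tree of section 2, and the names `neighbourhood_family` and `twin_pairs` are hypothetical.

```python
def neighbourhood_family(adj):
    """Build S_{2v-1} = N(v) and S_{2v} = N_c(v) for vertices numbered 1..n."""
    family = {}
    for v in sorted(adj):
        family[2 * v - 1] = frozenset(adj[v])        # open neighbourhood
        family[2 * v] = frozenset(adj[v]) | {v}      # closed neighbourhood
    return family


def twin_pairs(family):
    """Return pairs (i, j) of indices with S_i = S_j, found by hashing."""
    first_seen, pairs = {}, []
    for i, s in family.items():
        if s in first_seen:
            pairs.append((first_seen[s], i))
        else:
            first_seen[s] = i
    return pairs


# Triangle 1-2-3 with a pendant vertex 4 attached to 3 (n = 4, m = 4).
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
family = neighbourhood_family(adj)
pairs = twin_pairs(family)
total_size = sum(len(s) for s in family.values())
```

In this example S_2 = S_4, that is N_c(1) = N_c(2): vertices 1 and 2 are adjacent twins. The sets have total size 4m + n = 20, since the open neighbourhoods contribute 2m elements and the closed ones 2m + n.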

An equality operation is like scanning the graph for a couple of twins, so we
only lack the data structures necessary to detect pendant vertices. We simply
need to maintain the graph structure with the degrees of the vertices. This can
be done in linear time.

Algorithm 3 computes the “pruned graph”. We suppose that the equality
operation returns (0, 0) if there are no equal sets.

So we can use the results of section 2 with d = 4m + 2n and k = n.



Theorem 1. Given a graph with n vertices and m edges, we can compute its
pruned graph in the following bounds:

    Time bound     n + m     m log n     n^2
    Space bound    n^2       m log n     n + m
Algorithm 3: Pruning a graph

Initialize the data structures
repeat
    Let (i, j) be the output of an equality operation
    if i ≠ 0 then v ← the vertex corresponding to S_i
    else v ← a pendant vertex or 0        /* No twins */
    if v ≠ 0 then
        Perform a set delete operation on S_{2v-1}
        Perform a set delete operation on S_{2v}
        Perform an element delete operation on v
        Update the degree data structure
until v = 0



Remark 1. The order chosen on the vertices may be important: for example, if 
this order happens to be a reversed pruning order, we might be deleting only 
leaf nodges, which would make the third algorithm linear in time too. We will 
thus be interested in finding a good order on the graph. 



3.2 Special Cases with Linear Space and Time 

We will analyze again the complexity of the algorithm linear in space (the third
variant) in the special case when we are not dealing with both true and false
twins. We associate to each vertex v only one set S_v: N(v) if we just remove
true twins, N_c(v) if we just remove false twins. To ensure linear time, we will
need two slight modifications:

— Use of a special order on the vertices: a LexBFS order.

— When operation equality supplies us with (i, j) ≠ (0, 0), instead of deleting
the vertex corresponding to i, we will delete the vertex corresponding to
max(i, j). Each such vertex is guaranteed to have a smaller twin in the
graph.

LexBFS means “lexicographic breadth-first search” [RTL76]. This is a
particular breadth-first search, where neighbours of already chosen vertices are
favoured. The vertices are numbered, beginning at 1. When a vertex is picked as
next vertex, it marks its unnumbered neighbours with its number (see Fig. 5).
Lexicographic order is used to compare two marks, and the next vertex is chosen
with lowest mark. See also [HMPV97] for a simple algorithm (based on partition
refinement, not on marks) to generate a LexBFS order.
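A mark-based LexBFS can be sketched in a few lines. The version below follows the classical formulation of the literature, in which numbers are handed out in decreasing order and the vertex with the lexicographically largest mark is picked next; this is an assumption made for simplicity and differs in presentation from the increasing numbering used here. It runs in quadratic time, and the name `lexbfs` is hypothetical.

```python
def lexbfs(adj):
    """Return the vertices of `adj` in a LexBFS visiting order."""
    n = len(adj)
    label = {v: [] for v in adj}    # marks of the vertices not yet picked
    order = []
    for i in range(n, 0, -1):
        # Largest mark first; ties are broken by the iteration order of `adj`.
        v = max(label, key=lambda u: label[u])
        del label[v]
        order.append(v)
        for w in adj[v]:
            if w in label:          # only unnumbered neighbours get marked
                label[w].append(i)
    return order


adj = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
order = lexbfs(adj)
```

On this small tree the search starts at a, then visits b and c (both marked by a) before d, which is marked only by b.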

Precomputing a LexBFS order on the graph is interesting for two reasons.
First, it gives strong properties to the lexicographic tree. The main one concerns
the removal of twins: for the element deletion operation, merging two nodges
can be done in O(1). And all the other operations also have good complexities
(see [WWW]). Secondly, after the removal of a pendant vertex or a vertex that
has a smaller twin, we still have a LexBFS order on the graph.

Fig. 5. A LexBFS ordering with the mark of each vertex



Theorem 2.

— The modified algorithm prunes from a graph its pendant vertices and true
twins in linear time and space.

— The modified algorithm prunes from a graph its pendant vertices and false
twins in linear time and space.

So we have linear time instead of O(n^2) time.

Example: Bipartite Graphs. In such graphs, two vertices are false twins iff
they are pendant to each other (an isolated edge). Thus we can use our
algorithm (deleting pendant and true twin vertices) for an easy linear recognition
of distance-hereditary bipartite graphs.

4 Another Application: Complete Bipartite Cover 

The “covering by complete bipartite subgraphs” problem is to find how many
complete bipartite subgraphs we need (at the minimum) to cover every edge of
a given bipartite graph. This problem is NP-hard in the general case (Problem
GT18 in [GJ79]); however [Mül96] showed that the complexity falls to P for
some particular graph classes, like the distance-hereditary graphs. The algorithm
solving this particular case is not studied further than being polynomial. The
base step is to find a “bisimplicial” edge, which leads to a time complexity in
O(nm) on a bipartite graph with n vertices and m edges.

However, it is possible to find an algorithm that runs in linear time and space. 
We will again prune the input graph. 

If we are given an optimal complete bipartite cover on the pruned graph, it 
is easy to find an optimal one on the original graph. Suppose we add a vertex 
to a bipartite graph with an optimal cover: 

Adding u as a true twin of v: every complete bipartite graph in which v
occurs is changed by duplicating v into u. This is still a complete bipartite graph
and we still cover all the edges.

Adding u as a pendant vertex: we cannot extend an existing complete
bipartite graph so that it includes the new edge ending at u, for that would
require u to have another neighbor. Thus we have to add a new graph in the
cover, for instance the star centered in v, the unique neighbor of u.

Thus the number sought is simply increased by one for every “pendant
operation”.

Counting the number of “pendant operations” during the pruning gives
a simple algorithm to solve the problem on the class of bipartite
distance-hereditary graphs.

5 Conclusion 

The lexicographic tree is an interesting representation of a graph, and gives
good results for the “pruned graph” problem. Since we can compute it in linear
time and space from adjacency lists, this tree is an interesting tool for graph
problems.

Besides twin vertices, we are convinced that there must be other properties 
of graphs that can be read on this data structure, knowing that special orders 
on vertices and special classes of graphs have a strong influence on the structure 
of the tree. 

As far as we know, it is still an open problem to compute the pruned graph 
in both linear time and space, in the general case. 
Abstract. A macro tree transduction is MSO definable if and only if
it is of linear size increase. Furthermore, it is decidable for a macro tree 
transduction whether or not it is MSO definable. 



1 Introduction 

Macro tree transducers (MTTs) are a well-known model of syntax-directed
semantics that combines top-down tree transducers and context-free tree
grammars (see, e.g., [EV85, CF82, FV98]). Their (tree-to-tree) transductions
form a large class, containing the translations of attribute grammars. More
recently, tree transductions that can be specified in monadic second-order
(MSO) logic have been considered (see, e.g., [Cou94, BE98, EM99, KS94]). It is
shown in [BE98] that these MSO definable tree transductions can be computed by
(a special type of) attribute grammars. Thus, as stated in [EM99], every MSO
definable tree transduction can be realized by an MTT. However, not every macro
tree transduction is MSO definable. This raises the question: which macro tree
transductions are MSO definable?

There are different ways of answering this question. As shown in [EM99;
Section 7], every MSO definable tree transduction can be realized in particular
by a ‘finite copying’ MTT, and vice versa, if an MTT M is finite copying, then
its transduction is MSO definable. In this paper we prove a characterization
in terms of a property which is independent of M: M’s transduction is MSO
definable if and only if it is of linear size increase, i.e., the size of the
output tree is linearly bounded by the size of the input tree. Moreover we prove
that it is decidable whether M’s transduction is of linear size increase, and
hence whether it is MSO definable. Note that the MSO definable tree
transductions have several nice features that the macro tree transductions do
not possess: by definition they can be specified in MSO logic, they can be
computed in linear time [BE98], and they are closed under composition [Cou94].

The idea for our characterization stems from [AU71]: there it is shown that a
generalized syntax-directed translation (gsdt) scheme can be realized by a
tree-walking transducer if and only if it is of linear size increase. Since gsdt
schemes are a variation of top-down tree transducers, and tree-walking
transducers are

* This work was supported by the EC TMR Network GETGRATS. 



H. Reichel and S. Tison (Eds.): STAGS 2000, LNCS 1770, pp. 542-|^2^ 2000. 
(c) Springer- Verlag Berlin Heidelberg 2000 
closely related to finite copying top-down tree transducers [ERS80], our result
can be viewed as a generalization of the result of [AU71], from top-down tree
transducers to macro tree transducers. Moreover, we obtain a characterization of
the MSO definable top-down tree transductions that depends on the transducer:
they are exactly the transductions that can be realized by finite copying
top-down tree transducers (but to be precise, this only holds if the transducers
have regular look-ahead, see, e.g., [GS97; Section 18]).

Note that very often membership in a subclass is undecidable (such as
regularity of a context-free language). In cases of decidability there is often
a characterization of the subclass that is independent of the device that
defines the whole class, analogous to our linear size increase characterization
(as an example, in [Cou95] it is shown that a vertex replacement graph language
can be generated by a hyperedge replacement graph grammar if and only if the
number of edges of its graphs is linearly bounded by the number of nodes).

Structure of the paper: In Section 3 we recall MTTs (with regular
look-ahead); they are total deterministic. In Section 4 the two finite copying
(fc) properties for MTTs are defined (as in [EM99]) and their decidability is
proved; furthermore two pumping lemmas for non-fc MTTs are stated. Based on
these pumping lemmas and two normal forms it is shown in Section 5 that non-fc
MTTs in normal form are not of linear size increase. From [EM99] we know
that fc MTTs realize precisely the MSO definable tree transductions, which are
obviously of linear size increase. Together this yields our characterization in
Section 6 where the main results are given. The proofs are just sketched; details
can be found in the preliminary version of fFMj . 



2 Trees, Tree Automata, and Tree Transductions 

We assume the reader to be familiar with trees, tree automata, and tree
transductions (see, e.g., [GS97]). For $m \ge 0$ let $[m] = \{1, \ldots, m\}$.
A set $\Sigma$ together with a mapping $\mathrm{rank}\colon \Sigma \to \mathbb{N}$
is called a ranked set. For $k \ge 0$, $\Sigma^{(k)}$ is the set
$\{\sigma \in \Sigma \mid \mathrm{rank}(\sigma) = k\}$; we also write
$\sigma^{(k)}$ to denote that $\mathrm{rank}(\sigma) = k$. For a set $A$,
$\langle \Sigma, A\rangle$ is the ranked set
$\{\langle \sigma, a\rangle \mid \sigma \in \Sigma, a \in A\}$ with
$\mathrm{rank}(\langle \sigma, a\rangle) = \mathrm{rank}(\sigma)$.
The set of all trees over $\Sigma$ is denoted $T_\Sigma$. For a set $A$,
$T_\Sigma(A)$ is the set of all trees over $\Sigma \cup A$, where all elements
in $A$ have rank zero. The size of a tree $s$, denoted $\mathrm{size}(s)$, is
the number of its nodes. A node of $s$ is denoted by its Dewey notation in
$\mathbb{N}^*$ (e.g., the $b$-labeled node in $\sigma(a, \gamma(b))$ is denoted
by 21). The set of all nodes of $s$ is denoted by $V(s)$. For $u \in V(s)$, the
subtree of $s$ rooted at $u$ is denoted $s/u$, and for $s' \in T_\Sigma$ the
tree obtained by replacing $s/u$ in $s$ by $s'$ is denoted $s[u \leftarrow s']$.
For $\sigma \in \Sigma$, $\#_\sigma(s)$ denotes the number of occurrences of
$\sigma$ in $s$ and for $\Delta \subseteq \Sigma$,
$\#_\Delta(s) = \sum \{\#_\sigma(s) \mid \sigma \in \Delta\}$. We fix the set
$X$ of input variables $x_1, x_2, \ldots$ and the set $Y$ of parameters
$y_1, y_2, \ldots$. For $k \ge 0$, $X_k = \{x_1, \ldots, x_k\}$
and $Y_k = \{y_1, \ldots, y_k\}$.
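The tree notions above can be made concrete with a minimal Python sketch. The encoding is an assumption made only for illustration and is not part of the paper: a tree is a nested tuple whose first component is the label, and a Dewey address is a tuple of 1-based child indices. The function names `size`, `nodes`, `subtree`, `replace` and `occurrences` are likewise hypothetical.

```python
# Trees as nested tuples: (label, child_1, ..., child_k).
# Node addresses are Dewey paths: tuples of 1-based child indices.

def size(s):
    """Number of nodes of s."""
    return 1 + sum(size(child) for child in s[1:])

def nodes(s, prefix=()):
    """All node addresses of s, in pre-order."""
    yield prefix
    for i, child in enumerate(s[1:], start=1):
        yield from nodes(child, prefix + (i,))

def subtree(s, u):
    """The subtree s/u rooted at address u."""
    for i in u:
        s = s[i]  # index 0 holds the label, so child i sits at index i
    return s

def replace(s, u, new):
    """The tree s[u <- new]: s with the subtree at u replaced by new."""
    if not u:
        return new
    i = u[0]
    return s[:i] + (replace(s[i], u[1:], new),) + s[i + 1:]

def occurrences(s, label):
    """Number of nodes of s that carry the given label."""
    return int(s[0] == label) + sum(occurrences(c, label) for c in s[1:])

# The example tree sigma(a, gamma(b)); its b-labeled node has address 21.
s = ("sigma", ("a",), ("gamma", ("b",)))
print(subtree(s, (2, 1)))
print(size(s), list(nodes(s)))
print(replace(s, (2,), ("a",)))
```

Tuples are immutable, so `replace` builds a new tree and leaves `s` unchanged, which matches the functional reading of $s[u \leftarrow s']$.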

A finite tree automaton is a tuple $(P, \Sigma, h)$, where $P$ is a finite set
of states, $\Sigma$ is a ranked alphabet, and $h$ is a collection of mappings
$h_\sigma\colon P^k \to P$, for $\sigma \in \Sigma^{(k)}$. We also use $h$ to
denote its (usual) extension to a mapping from $T_\Sigma$ to $P$. For $p \in P$
the set $\{s \in T_\Sigma \mid h(s) = p\} = h^{-1}(p)$ is denoted by $L_p$.
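As a small illustration of how $h$ extends from symbols to trees, the following Python sketch evaluates a finite tree automaton bottom-up. The tuple encoding of trees, the dictionary encoding of $h$, and the concrete alphabet (a unary symbol `gamma` with leaves `a` and `b`) are assumptions chosen for the example; the automaton has two states whose languages are the monadic trees ending in `a` and in `b`, respectively.

```python
def run(h, s):
    """Bottom-up evaluation: the state h(s) reached at the root of s.

    h maps (symbol, tuple_of_child_states) to a state.
    """
    label, children = s[0], s[1:]
    return h[(label, tuple(run(h, c) for c in children))]

# Hypothetical automaton with states p and p':
# L_p  = { gamma^n(a) | n >= 0 },  L_p' = { gamma^n(b) | n >= 0 }.
h = {
    ("a", ()): "p",
    ("b", ()): "p'",
    ("gamma", ("p",)): "p",
    ("gamma", ("p'",)): "p'",
}

def gamma_n(n, leaf):
    """The monadic tree gamma^n(leaf)."""
    t = (leaf,)
    for _ in range(n):
        t = ("gamma", t)
    return t

print(run(h, gamma_n(3, "a")))
print(run(h, gamma_n(2, "b")))
```

Membership of a tree $s$ in $L_p$ is then simply the test `run(h, s) == "p"`.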

A tree transduction is a mapping $\tau\colon T_\Sigma \to T_\Delta$, for ranked
alphabets $\Sigma$ and $\Delta$. The tree transduction $\tau$ is MSO definable
if there exist a finite set $C$ and MSO($\Sigma$)-formulas $\nu_c(x)$,
$\psi_{\delta,c}(x)$, and $\chi_{i,c,d}(x, y)$, with $c, d \in C$,
$\delta \in \Delta$, and $i \ge 1$, such that for every $s \in T_\Sigma$,
$V(\tau(s)) = \{(c, x) \in C \times V(s) \mid s \models \nu_c(x)\}$,
node $(c, x)$ has label $\delta$ iff $s \models \psi_{\delta,c}(x)$, and
$(d, y)$ is the $i$-th child of $(c, x)$ iff $s \models \chi_{i,c,d}(x, y)$.
An MSO($\Sigma$)-formula is a formula of monadic second-order logic that uses
atomic formulas $\mathrm{lab}_\sigma(x)$ and $\mathrm{child}_i(x, y)$, with
$\sigma \in \Sigma$ and $i \ge 1$, to express that $x$ has label $\sigma$ and
$y$ is the $i$-th child of $x$, respectively. The class of all MSO definable
tree transductions is denoted MSOTT. For examples and more details, see, e.g.,
[Cou94, BE98].

A tree transduction $\tau$ is of linear size increase (for short, lsi), if
there is a $c \ge 1$ such that, for every $s \in T_\Sigma$,
$\mathrm{size}(\tau(s)) \le c \cdot \mathrm{size}(s)$. The class of all tree
transductions of linear size increase is denoted by LSI. Note that every MSO
definable tree transduction $\tau$ is lsi (with constant $c = |C|$), i.e.,
MSOTT $\subseteq$ LSI.



3 Macro Tree Transducers 

A macro tree transducer is a syntax-directed translation device in which the
translation of an input tree may depend on its context. The context information
is handled by parameters. We will consider total deterministic MTTs only.
For technical reasons we add regular look-ahead to MTTs (recall from [EV85;
Theorem 4.21] that this does not change the class of transductions).

A macro tree transducer with regular look-ahead (for short, MTT$^R$) is a tuple
$M = (Q, P, \Sigma, \Delta, q_0, R, h)$, where $Q$ is a ranked alphabet of
states, $\Sigma$ and $\Delta$ are ranked alphabets of input and output symbols,
respectively, $q_0 \in Q^{(0)}$ is the initial state, $(P, \Sigma, h)$ is a
finite tree automaton, called the look-ahead automaton, and $R$ is a finite set
of rules. For every $q \in Q^{(m)}$, $\sigma \in \Sigma^{(k)}$, and
$p_1, \ldots, p_k \in P$ with $m, k \ge 0$ there is exactly one rule of the form

  $\langle q, \sigma(x_1, \ldots, x_k)\rangle(y_1, \ldots, y_m) \to \zeta \quad \langle p_1, \ldots, p_k\rangle$   (*)

in $R$, where $\zeta \in T_{\langle Q, X_k\rangle \cup \Delta}(Y_m)$.

A rule $r$ of the form (*) is called the
$(q, \sigma, \langle p_1, \ldots, p_k\rangle)$-rule and its right-hand side
$\zeta$ is denoted by $\mathrm{rhs}(r)$ or by
$\mathrm{rhs}_M(q, \sigma, \langle p_1, \ldots, p_k\rangle)$; it is also called
a $q$-rule. A top-down tree transducer with regular look-ahead (for short,
T$^R$) is an MTT$^R$ all states of which are of rank zero. If the look-ahead
automaton is trivial, i.e., $P = \{p\}$ and $h_\sigma(p, \ldots, p) = p$ for
all $\sigma \in \Sigma$, then $M$ is called a macro tree transducer (for short,
MTT) and if $M$ is a T$^R$, then $M$ is called a top-down tree transducer. In
such cases we omit the look-ahead automaton and simply denote $M$ by
$(Q, \Sigma, \Delta, q_0, R)$; we also omit the look-ahead part
$\langle p_1, \ldots, p_k\rangle$ in rule (*). If, for every $q$-rule $r$ with
$q \in Q^{(m)}$ and every $j \in [m]$: $\#_{y_j}(\mathrm{rhs}(r)) \le 1$, then
$M$ is linear in the parameters (for short, linp). We also use ‘linp’ as
subscript for the corresponding transducers and classes of transductions.
The rules of $M$ are used as term rewriting rules in the usual way with the
additional restriction that the rule (*) can be applied at a subtree
$\sigma(s_1, \ldots, s_k)$ only if, for every $i \in [k]$, $s_i \in L_{p_i}$.
The derivation relation of $M$ (on
$T_{\langle Q, T_\Sigma\rangle \cup \Delta}$) is denoted by $\Rightarrow_M$.
The transduction realized by $M$, denoted $\tau_M$, is the total function
$\{(s, t) \in T_\Sigma \times T_\Delta \mid \langle q_0, s\rangle \Rightarrow_M^* t\}$.
The class of all transductions which can be realized by MTT$^R$s (MTTs, T$^R$s)
is denoted by $MTT^R$ ($MTT$, $T^R$).

Let $s \in T_\Sigma$ and $u \in V(s)$. For $s_1 = s/u$, $\tau_M(s)$ can be
obtained by the derivation
$\langle q_0, s\rangle \Rightarrow_M^* \ldots \langle q', s_1\rangle \ldots \langle q'', s_1\rangle \ldots \Rightarrow_M^* \tau_M(s)$,
in which first the part of $s$ outside of $s_1$ is translated and then $s_1$ is
translated. The intermediate sentential form
$\xi = \ldots \langle q', s_1\rangle \ldots \langle q'', s_1\rangle \ldots$
shows in which states the subtree $s_1$ of $s$ at $u$ is processed. We will now
extend $\Rightarrow_M$ in such a way that $\xi$ can be generated as ‘final’
tree. Moreover, we allow the translation to start with an arbitrary state
instead of $q_0$. For $q \in Q^{(m)}$ and $s' \in T_\Sigma(P)$, $M_q(s')$ is
defined to be the (unique) tree $t$ in
$T_{\langle Q, P\rangle \cup \Delta}(Y_m)$ such that
$\langle q, s'\rangle(y_1, \ldots, y_m) \Rightarrow_M^* t$, where
$\Rightarrow_M$ is extended to
$T_{\langle Q, T_\Sigma(P)\rangle \cup \Delta}(Y_m)$ in the obvious way,
extending the look-ahead automaton with $h_p = p$ for every $p \in P$. Now we
get
$\langle q_0, s[u \leftarrow p]\rangle \Rightarrow_M^* \ldots \langle q', p\rangle \ldots \langle q'', p\rangle \ldots = M_{q_0}(s[u \leftarrow p])$,
which is $\xi$ with $s_1$ replaced by $h(s_1) = p$. Note that in particular,
$M_{q_0}(s) = \tau_M(s)$.

We say that $\langle q, p\rangle \in \langle Q, P\rangle$ is reachable, if
there are $s \in T_\Sigma$ and $u \in V(s)$ such that $\langle q, p\rangle$
occurs in $M_{q_0}(s[u \leftarrow p])$.

Assumptions: For an MTT$^R$ $M$ we assume from now on that (i) $q_0$ does not
occur in any right-hand side of a rule of $M$, (ii) if $\langle q, p\rangle$ is
not reachable, then there is a $\delta \in \Delta^{(m)}$ such that
$\langle q, s\rangle(y_1, \ldots, y_m) \Rightarrow_M^* \delta(y_1, \ldots, y_m)$
for every $s \in L_p$, (iii) $M$ has no “erasing rules”, i.e., rules with
right-hand side $y_1$, and (iv) $M$ is nondeleting, i.e., for every $q$-rule
$r$ with $q \in Q^{(m)}$ and every $j \in [m]$:
$\#_{y_j}(\mathrm{rhs}(r)) \ge 1$. The proof that (iii) and (iv) may be assumed
without loss of generality can be done by the use of regular look-ahead (see
[EM99; Lemma 7.11]).

4 Finite Copying 

A rule of an MTT$^R$ $M$ is copying, if its right-hand side is (i) nonlinear in
the input variables or (ii) nonlinear in the parameters. We now want to put a
bound on both of these ways of copying. For (ii) this is simple: $M$ is finite
copying in the parameters (for short, fcp), if for every $q \in Q^{(m)}$,
$s \in T_\Sigma$, and $j \in [m]$, $\#_{y_j}(M_q(s)) \le k$ for a fixed
$k \ge 0$. For (i), we generalize the notion of finite copying from top-down
tree transducers (cf. [ERS80, AU71]) to MTT$^R$s. The state sequence of $s$ at
node $u$, denoted by $\mathrm{sts}_M(s, u)$, contains all states which process
the subtree $s/u$. Formally,
$\mathrm{sts}_M(s, u) = \pi(M_{q_0}(s[u \leftarrow p]))$, where $p = h(s/u)$
and $\pi$ changes every $\langle q, p\rangle$ into $q$ and deletes everything
else. If for every $s \in T_\Sigma$ and $u \in V(s)$,
$|\mathrm{sts}_M(s, u)| \le k$ for a fixed $k \ge 0$, then $M$ is called finite
copying in the input (for short, fci). We also use ‘fci’ and ‘fcp’ as
subscripts for the corresponding transducers and classes of transductions. The
following proposition follows from Lemmas 6.3 and 6.7, and Theorem 7.1 of
[EM99] (where ‘surp’ is used instead of ‘linp’).
Proposition 1. $MTT^R_{fcp} = MTT^R_{linp}$ and $MTT^R_{fci,linp} = MSOTT$.

For technical reasons we introduce another way of bounding the copying of
input variables: associate with every MTT$^R$ $M$ the T$^R$
$\mathrm{top}(M)$, which has the same look-ahead automaton $(P, \Sigma, h)$ as
$M$ and is obtained by changing rule (*) into the rule
$\langle q, \sigma(x_1, \ldots, x_k)\rangle \to \zeta' \quad \langle p_1, \ldots, p_k\rangle$,
where
$\zeta' = \sigma(\langle q_1, x_{i_1}\rangle, \sigma(\langle q_2, x_{i_2}\rangle, \ldots \sigma(\langle q_n, x_{i_n}\rangle, e) \ldots))$
if $\langle q_1, x_{i_1}\rangle, \ldots, \langle q_n, x_{i_n}\rangle$ are all
elements of $\langle Q, X_k\rangle$ that occur in $\zeta$. Then $M$ is globally
fci (for short, gfci), if $\mathrm{top}(M)$ is fci. We also use ‘gfci’ as
subscript for the corresponding transducers and classes of transductions.
Intuitively, $\mathrm{top}(M)$ carries out the pure state behavior of $M$ (in
particular the copying of input variables) without the parameter behavior of
$M$. Note that gfci does not coincide with fci in general (cf. Example 10), but
that it does for linp MTT$^R$s! The following lemma shows the latter, by
relating the number of occurrences of a state in $\mathrm{sts}_M(s, u)$ and
$\mathrm{sts}_{\mathrm{top}(M)}(s, u)$.

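Lemma 2. Let $M = (Q, P, \Sigma, \Delta, q_0, R, h)$ be an MTT$^R$. For every
$q, q' \in Q$, $s \in T_\Sigma$, $u \in V(s)$, and $p \in P$:

  $\#_{\langle q', p\rangle}(M_q(s[u \leftarrow p])) \ge \#_{\langle q', p\rangle}(\mathrm{top}(M)_q(s[u \leftarrow p]))$,

and they are equal if $M$ is linp. On the other hand, for every path $\pi$ in
the tree $M_q(s[u \leftarrow p])$, the number of
$\langle q', p\rangle$-labeled nodes on $\pi$ is $\le$ the number of
$\langle q', p\rangle$-labeled nodes of $\mathrm{top}(M)_q(s[u \leftarrow p])$.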

Proof. By Assumption (iv), $M$ is nondeleting. If an actual parameter $\xi$ is
copied in a derivation step of $M$, then the states that occur in $\xi$ are
copied too, and so the number of occurrences of $q'$ can grow more than in the
corresponding (noncopying) step of $\mathrm{top}(M)$. Considering a path $\pi$
only, there is no copying of states; thus, the application of a rule $r$ of $M$
increases the number of occurrences of $q'$ on $\pi$ at most by
$\#_{\langle q', x_i\rangle}(\mathrm{rhs}(r))$ which equals
$\#_{\langle q', x_i\rangle}(\mathrm{rhs}(r'))$ for the corresponding rule $r'$
of $\mathrm{top}(M)$. □

Note that Lemma 2 implies that a linp MTT$^R$ (and in particular a T$^R$) is
gfci iff it is fci. Let us now prove the decidability of the fci and fcp
properties.

Lemma 3. For an MTT$^R$ $M$ it is decidable (i) whether or not $M$ is fci and
(ii) whether or not $M$ is fcp.

Proof. (i) It is straightforward to construct an MTT$^R$ $M'$ such that
$\tau_{M'}(s[u \leftarrow p]) = M_{q_0}(s[u \leftarrow p])$, and to construct
an MTT $M''$ which translates every tree $t$ in
$T_{\langle Q, P\rangle \cup \Delta}$ into a monadic tree
$q_1(q_2(\cdots q_n(e)))$, where
$\langle q_1, p_1\rangle, \ldots, \langle q_n, p_n\rangle$ are all elements of
$\langle Q, P\rangle$ that occur in $t$. Thus, for
$t = \tau_{M'}(s[u \leftarrow p])$, $\tau_{M''}(t)$ equals
$\mathrm{sts}_M(s, u)$, seen as a monadic tree. Hence, for
$K = \{s \in T_\Sigma(P) \mid \#_P(s) = 1\}$, $M$ is fci iff
$\tau_{M''}(\tau_{M'}(K))$ is finite. This is decidable, because finiteness of
the range of a composition of MTT$^R$s, restricted to a regular tree language
$K$, is decidable [DE98].

(ii) It is straightforward to construct an MTT$^R$ $M'$ with input alphabet
$\Sigma \cup \{q^{(1)} \mid q \in Q\}$ and output alphabet $\Delta \cup Y_m$
(for an appropriate $m$) such that $\tau_{M'}(q(s)) = M_q(s)$ for every
$q \in Q$ and $s \in T_\Sigma$, and to construct an MTT$^R$ $M''$ such that for
every $t \in T_{\Delta \cup Y_m}$,
$\mathrm{size}(\tau_{M''}(t)) = 1 + \max\{\#_{y_j}(t) \mid j \in [m]\}$. Hence,
for $K = \{q(s) \mid q \in Q, s \in T_\Sigma\}$, $M$ is fcp iff
$\tau_{M''}(\tau_{M'}(K))$ is finite. As above this is decidable by [DE98]. □

Pumping Lemmas We now present two pumping lemmas for non-fci T$^R$s
and for non-fcp MTT$^R_{gfci}$s, respectively. They are the core of the proof,
in Section 5, that linear size increase implies gfci and fcp. The first lemma is
similar to Lemma 4.2 of [AU71]. We use the following notation (to “pump” a
tree). For $s \in T_\Sigma$, $u \in V(s)$, $p \in P$, and
$s' \in T_\Sigma(P)$, let $s[u \leftarrow p] \bullet s'$ denote
$s[u \leftarrow s']$. Thus,
$(s[u \leftarrow p])^2 = (s[u \leftarrow p]) \bullet (s[u \leftarrow p]) = s[u \leftarrow s[u \leftarrow p]]$.

Lemma 4. Let $M = (Q, P, \Sigma, \Delta, q_0, R, h)$ be a T$^R$. If $M$ is not
fci, then there are $q_1, q_2 \in Q$, $s_0 \in T_\Sigma$,
$u_0, u_0u_1 \in V(s_0)$, and $p \in P$ such that the following four conditions
hold, with $s_1 = s_0/u_0$.

(1) $\langle q_1, p\rangle$ occurs in $M_{q_0}(s_0[u_0 \leftarrow p])$,

(2) $\langle q_1, p\rangle$ and $\langle q_2, p\rangle$ occur at distinct nodes
of $M_{q_1}(s_1[u_1 \leftarrow p])$,

(3) $\langle q_2, p\rangle$ occurs in $M_{q_2}(s_1[u_1 \leftarrow p])$, and

(4) $p = h(s_1) = h(s_1/u_1)$.

Proof. Let $t \in T_\Sigma$ and $v \in V(t)$. Then for every ancestor $u$ of
$v$ let $\mathrm{csts}(u, v)$ denote $\mathrm{sts}_M(t, u)$ restricted to
states $q$ which contribute to $\mathrm{sts}_M(t, v)$, i.e., for which
$\xi = M_q(t/u[v' \leftarrow h(t/v)]) \notin T_\Delta(Y)$ with $v = uv'$. For
any state $r$ that occurs in $\xi$ we write $q \to_{u,v} r$. Note that
$q \to_{u,v} q' \to_{v,w} q''$ implies $q \to_{u,w} q''$. Since $M$ is not fci,
there are arbitrarily long state sequences; in particular, for $\rho$ the
maximal number of occurrences of $x_i$ in a right-hand side of the rules of
$M$, and every $n \ge 1$, there are $t \in T_\Sigma$ and $v_{n+1} \in V(t)$
such that $|\mathrm{sts}_M(t, v_{n+1})| > \rho^n$.

Fig. 1. (a) Tree $t$ with contributing states. (b) Conditions (2) and (3) for
$q_1 = q'_1$ and $q_2 = q'_4$.
This means that (on the path to $v_{n+1}$ in $t$) at least $n$ times rules that
copy the corresponding input variable have been applied. Thus, there are
different ancestors $v_1, \ldots, v_n$ of $v_{n+1}$ and states
$r_1, \ldots, r_n, r_{n+1}, r'_1, \ldots, r'_{n+1}$ such that, for every
$i \in [n+1]$, $r_i$ and $r'_i$ occur in $\mathrm{csts}(v_i, v_{n+1})$ (at
different positions) and, for every $i \in [n]$,
$r_i \to_{v_i,v_{i+1}} r_{i+1}$ and $r_i \to_{v_i,v_{i+1}} r'_{i+1}$ (see
Fig. 1(a), where the nodes $v_i$ in $t$ with their csts’s are shown, and arrows
mean $\to_{v_i,v_{i+1}}$).

Take $n = |Q| \cdot |P| \cdot 2^{|Q|}$ and let $t$, $v_i$, and $r_i$, $r'_i$ be
as above (for $i \in [n+1]$). Clearly this means that there are distinct
indices $i, j \in [n+1]$ such that $r_i = r_j$, $p = h(t/v_i) = h(t/v_j)$, and
the same states appear in $\mathrm{csts}(v_i, v_{n+1})$ and in
$\mathrm{csts}(v_j, v_{n+1})$. Now let $q'_1 = r_i$ and $q'_2 \in Q$ with
$r'_{i+1} \to_{v_{i+1},v_j} q'_2$ ($q'_2$ exists because $r'_{i+1}$ contributes
to $\mathrm{sts}_M(t, v_{n+1})$). Then $q'_1 \to_{v_i,v_j} q'_1$ and
$q'_1 \to_{v_i,v_j} q'_2$ which shows (1), (2), and (4), for $s_0 = t$,
$u_0 = v_i$, $u_0u_1 = v_j$, $q_1 = q'_1$, and $q_2 = q'_2$. To realize (3), we
will pump the tree $t_i[v'_j \leftarrow p]$ in $t$, where $t_i = t/v_i$ and
$v_j = v_iv'_j$. Since the same states appear in $\mathrm{csts}(v_i, v_{n+1})$
and $\mathrm{csts}(v_j, v_{n+1})$, there is a sequence
$q'_2 \to_{v_i,v_j} q'_3 \to_{v_i,v_j} \cdots \to_{v_i,v_j} q'_m \to_{v_i,v_j} q'_{m-\nu}$
with $m \le |Q|$ and $0 \le \nu < m$. Let $d \ge m - (\nu + 1)$ be a multiple
of $\nu + 1$. For
$s_0 = (t[v_i \leftarrow p]) \bullet (t_i[v'_j \leftarrow p])^d \bullet (t/v_j)$,
$q_1 = q'_1$, $q_2 = q'_{d+1}$, and $w = v_i(v'_j)^d$: $q_1 \to_{v_i,w} q_1$,
$q_1 \to_{v_i,w} q_2$, and $q_2 \to_{v_i,w} q_2$ in $s_0$, which shows (3) for
$u_0 = v_i$ and $u_1 = (v'_j)^d$ and thus the lemma (because the pumping
preserves (1), (2), and (4)). Figure 1(b) outlines the choice of $q_2$ for
$m = 5$ and $\nu = 2$ (thus $d = 3$ and $q_2 = q'_4$). □

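Lemma 5. Let $M = (Q, P, \Sigma, \Delta, q_0, R, h)$ be an MTT$^R_{gfci}$. If
$M$ is not fcp, then there are $m \ge 1$, $q \in Q^{(m)}$, $j \in [m]$,
$s \in T_\Sigma$, $u \in V(s)$, and $p \in P$ such that

(1) $\#_{y_j}(M_q(s[u \leftarrow p])) \ge 2$,

(2) $M_q(s[u \leftarrow p])$ has a subtree
$\langle q, p\rangle(\xi_1, \ldots, \xi_m)$ such that $\#_{y_j}(\xi_j) \ge 1$,
and

(3) $p = h(s) = h(s/u)$.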

Proof. Let $\beta$ be an input copying bound of $\mathrm{top}(M)$, $\eta$ the
maximal height of a right-hand side of $M$, and $\kappa$ the maximal number of
occurrences of one particular parameter in a right-hand side of $M$. For
$t \in T_\Sigma$, $u$ an ancestor of a $v \in V(t)$, $q \in Q^{(m)}$,
$\mu \in [m]$, $r \in Q^{(m')}$, $\nu \in [m']$, define
$(q, \mu) \to_{u,v} (r, \nu)$ if, for
$\xi_{quv} = M_q(t/u[v' \leftarrow p])$ with $v = uv'$ and $p = h(t/v)$:
$\#_{y_\mu}(\xi_{quv}) \ge 2$ and $\xi_{quv}$ has a subtree
$\langle r, p\rangle(\xi_1, \ldots, \xi_{m'})$ such that
$\#_{y_\mu}(\xi_\nu) \ge 1$. Note that
$(q, \mu) \to_{u,v} (q', \mu') \to_{v,w} (q'', \mu'')$ implies
$(q, \mu) \to_{u,w} (q'', \mu'')$.

Claim: For every > 1, u G V{t), q G and p G [to], if ffy^{Mq{t/u)) > 

then there exist a descendant u of u, a state r G \ and a jz G [to'] 
such that (q,p) ~^u,v (f,^) and ffy^(Mr(t/v)) > N. 

Proof. Let w be a lowest descendant of u such that ffy^ifquw) = 1- Then 
at least one of its children satisfies the requirements. In fact, assume to the 
contrary that ffy^{Mr{t/v)) < N for all (r, tz) and every child v of w. Consider 
the result f of applying a rule of M to each occurrence of {r',t/w), r' G Q, in 
the sentential form corresponding to ^quw Note that C Mq{t/u). It follows 
from Assumption (ii) and the second part of Lemma 2 that there are at most 
(3 occurrences of states on a path of ^quw Hence ffy^iC) < , and at most f3p 

states occur on each path of C,. Thus, ^y^{Mq{t/u)) < • a contradiction. 
From the fact that $M$ is not fcp and this claim it follows that, for every
$n \ge 1$, there are $t \in T_\Sigma$, different ancestors $v_1, \ldots, v_n$
of a node $v_{n+1}$ of $t$, and state-parameter pairs
$(q_1, \mu_1), \ldots, (q_{n+1}, \mu_{n+1})$ such that
$(q_i, \mu_i) \to_{v_i,v_{i+1}} (q_{i+1}, \mu_{i+1})$ for every $i \in [n]$.
Take $n = |Q| \cdot m \cdot |P|$ where $m$ is the maximal rank of a state of
$M$. Then there are distinct indices $i, i' \in [n+1]$ such that
$q = q_i = q_{i'}$, $j = \mu_i = \mu_{i'}$, and
$p = h(t/v_i) = h(t/v_{i'})$. Then $(q, j) \to_{v_i,v_{i'}} (q, j)$. Let
$s = t/v_i$ and $v_iu = v_{i'}$ (and so (3) holds). Then, in $s$,
$(q, j) \to_{v,u} (q, j)$ where $v$ is the root of $s$, which means that (1)
and (2) hold (with $\xi = \xi_{qvu}$). □

5 MTTs of Linear Size Increase 

In the remainder of the paper we want to show that if the transduction of an
MTT$^R$ $M$ is of linear size increase, then $M$ is equivalent to an MTT$^R$
which is fci and linp (this suffices to prove our characterization, by
Proposition 1 and the inclusion MSOTT $\subseteq$ LSI). To do this, we first
show how to obtain an equivalent MTT$^R$ which is gfci (using Lemma 4). By
applying Lemma 5 we can show that it is fcp and thus linp, by (the first part
of) Proposition 1. Finally we apply Lemma 4 again to get gfci and linp.

Consider the MTT $M_1 = (Q, \Sigma, \Delta, q_0, R)$ with
$Q = \{q_0^{(0)}, q^{(0)}, q'^{(0)}\}$,
$\Sigma = \{\gamma^{(1)}, a^{(0)}, b^{(0)}\}$,
$\Delta = \{\sigma^{(2)}, a^{(0)}, b^{(0)}\}$, and $R$ consisting of the
following rules.

  $\langle q_0, \gamma(x_1)\rangle \to \sigma(\langle q, x_1\rangle, \langle q', x_1\rangle)$
  $\langle q, \gamma(x_1)\rangle \to \langle q, x_1\rangle$
  $\langle q', \gamma(x_1)\rangle \to \sigma(\langle q, x_1\rangle, \langle q', x_1\rangle)$
  $\langle r, \alpha\rangle \to \alpha$   (for every $r \in Q$ and $\alpha \in \{a, b\}$)

Note that $M_1$ is a top-down tree transducer; thus $M_1$ is gfci iff it is fci
(cf. Lemma 2). Intuitively, $M_1$ translates a monadic tree $s$ of height $n$
into a comb $t$ of height $n$; the leaves of $t$ have the same label as the
leaf of $s$. Thus, $\mathrm{size}(\tau_{M_1}(s)) \le 2 \cdot \mathrm{size}(s)$
for every $s \in T_\Sigma$ and so $\tau_{M_1}$ is lsi. Clearly, $M_1$ is not
fci because $\mathrm{sts}_{M_1}(\gamma^n(a), u) = q^nq'$ for $n \ge 1$ and $u$
the leaf of $\gamma^n(a)$. The reason for this is that $M_1$ generates many
copies of $q$, but $q$ generates only a finite number of different trees (viz.
the trees $a$ and $b$). How can we change $M_1$ into an equivalent MTT$^R$
which is fci? The idea is to simply delete the state $q$ and to determine by
regular look-ahead the appropriate tree in $\{a, b\}$. In this example we just
need $L_p = \{\gamma^n(a) \mid n \ge 0\}$ and
$L_{p'} = \{\gamma^n(b) \mid n \ge 0\}$ and then the $q_0$-rule of $M_1$ is
replaced by two $q_0$-rules with right-hand sides
$\sigma(a, \langle q', x_1\rangle)$ and $\sigma(b, \langle q', x_1\rangle)$ for
look-ahead $p$ and $p'$, respectively, and similarly for the $q'$-rule.
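The behavior of $M_1$ can be checked on small inputs with the following Python sketch. It is only an illustration under stated assumptions: trees are encoded as nested tuples, the state $q'$ is written `qp`, and the helper names `translate` and `state_sequence` are hypothetical. The sketch hard-codes the four rule schemes of $M_1$ rather than implementing a general transducer.

```python
def size(s):
    """Number of nodes of a tree given as (label, child_1, ..., child_k)."""
    return 1 + sum(size(c) for c in s[1:])

def gamma_n(n, leaf="a"):
    """The monadic input tree gamma^n(leaf)."""
    t = (leaf,)
    for _ in range(n):
        t = ("gamma", t)
    return t

def translate(state, s):
    """Apply the rules of M_1 starting in the given state."""
    if s[0] in ("a", "b"):            # <r, alpha> -> alpha for every state r
        return (s[0],)
    child = s[1]
    if state == "q":                  # <q, gamma(x1)> -> <q, x1>
        return translate("q", child)
    # q0 and q' share the right-hand side sigma(<q, x1>, <q', x1>)
    return ("sigma", translate("q", child), translate("qp", child))

def state_sequence(n):
    """States that process the leaf of gamma^n(a), from left to right."""
    seq = ["q0"]
    for _ in range(n):
        nxt = []
        for st in seq:
            nxt += ["q"] if st == "q" else ["q", "qp"]
        seq = nxt
    return seq

for n in range(1, 6):
    out = translate("q0", gamma_n(n))
    print(n, size(gamma_n(n)), size(out), "".join(
        {"q": "q ", "qp": "q' "}[st] for st in state_sequence(n)))
```

The output size stays within twice the input size, while the state sequence at the leaf has length $n + 1$ and therefore is not bounded by any fixed $k$.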

In general, we require that every state of an MTT$^R$ $M$, except possibly the
initial one, generates infinitely many output trees (in $T_\Delta(Y)$). More
precisely, for every $p \in P$, $M$ should generate infinitely many output
trees for input trees from $L_p$. Formally, the MTT$^R$
$M = (Q, P, \Sigma, \Delta, q_0, R, h)$ is input proper (for short, i-proper),
if for every $q \in Q$ and $p \in P$ such that $q \ne q_0$ and
$\langle q, p\rangle$ is reachable, the set
$\mathrm{Out}(q, p) = \{M_q(s) \mid s \in L_p\}$ is infinite.

This notion was defined in [AU71] for generalized syntax-directed translation
schemes (which are a variant of top-down tree transducers) and was there called
‘reduced’. The construction in the proof of the following lemma is similar to
the one of Lemma 5.5 of [AU71], except that we apply it repeatedly to obtain an
i-proper MTT$^R$, as opposed to their single application, which is insufficient
(also in their formalism, which means that their proof of the lemma is
incorrect).

Lemma 6. For every MTT$^R$ $M$ there is (effectively) an i-proper MTT$^R$ $M'$
equivalent to $M$. If $M$ is linp, then so is $M'$. Also, if $M$ is a T$^R$,
then so is $M'$.

Proof. Let $M = (Q, P, \Sigma, \Delta, q_0, R, h)$ be an MTT$^R$. For each
$p \in P$, let $F_p = \{q \in Q \mid \mathrm{Out}(q, p)$ is finite$\}$. Note
that $F_p$ can be constructed effectively, because it is decidable whether
$\mathrm{Out}(q, p)$ is finite. In fact, it is easy to construct an MTT$^R$
$N_q$ such that for every $s \in T_\Sigma$, $\tau_{N_q}(s) = M_q(s)$ if
$s \in L_p$, and $e$ otherwise. Then the range of $N_q$ equals
$\mathrm{Out}(q, p) \cup \{e\}$ and its finiteness is decidable by [DE98]
(and if it is finite, it can be computed).

The MTT$^R$ $M'$ is constructed in such a way that, if
$\langle r, x_i\rangle$ occurs in
$\mathrm{rhs}_{M'}(q, \sigma, \langle p_1, \ldots, p_k\rangle)$, then
$r \notin F_{p_i}$. Clearly, this implies i-properness of $M'$. We first
construct the MTT$^R$ $\pi(M)$ by simply deleting occurrences of
$\langle r, x_i\rangle$ with $r \in F_{p_i}$ and replacing them by the correct
tree in $\mathrm{Out}(r, p_i)$, which is determined by regular look-ahead. Due
to the change of look-ahead automaton, an occurrence of
$\langle r, x_i\rangle$ in the
$(q, \sigma, \langle p_1, \ldots, p_k\rangle)$-rule of $M$ with
$r \notin F_{p_i}$ might produce only finitely many trees for the new
look-ahead states $(p_i, \varphi_i)$, which means that $\pi(M)$ is not i-proper
yet. For this reason we have to iterate the application of $\pi$ until the sets
$F_p$ do not change anymore. This results in the desired MTT$^R$ $M'$.

For each $p \in P$ let $\Phi_p$ be the set of all mappings
$\varphi\colon F_p \to T_\Delta(Y)$ such that there is an $s \in L_p$ with
$\varphi(q) = M_q(s)$ for every $q \in F_p$. The (finite) set $\Phi_p$ can be
constructed effectively: for every possible mapping $\varphi$ which associates
with every $q \in F_p$ a tree in $\mathrm{Out}(q, p)$, $\varphi$ is in $\Phi_p$
iff $K = L_p \cap \bigcap \{M_q^{-1}(\varphi(q)) \mid q \in F_p\}$ is
nonempty. This is decidable, because $K$ is regular by [EV85; Theorem 7.4(i)].
Since each mapping in $\Phi_p$ fixes an output tree for every state
$q \in F_p$, the mappings in $\Phi_p$ partition the set of input trees $L_p$.
The sets in this (finite) partition are regular (viz. the sets $K$) and thus
can be determined by regular look-ahead.

The MTT$^R$ $\pi(M)$ has look-ahead states
$\{(p, \varphi) \mid p \in P, \varphi \in \Phi_p\}$ with
$h'_\sigma((p_1, \varphi_1), \ldots, (p_k, \varphi_k)) = (h_\sigma(p_1, \ldots, p_k), \varphi)$
and $\varphi(q) = \zeta_q$ for every $q \in F_p$ with
$p = h_\sigma(p_1, \ldots, p_k)$, where
$\zeta_q = \mathrm{rhs}_{\pi(M)}(q, \sigma, \langle (p_1, \varphi_1), \ldots, (p_k, \varphi_k)\rangle)$
is obtained from $\zeta$ in rule (*) by replacing every
$\langle r, x_i\rangle$ by $\varphi_i(r)$, for $r \in F_{p_i}$. □

We can now apply our first pumping lemma (Lemma 4) to non-gfci MTT$^R$s.

Lemma 7. Let $M$ be an i-proper MTT$^R$. If $\tau_M \in$ LSI, then $M$ is gfci.

Proof. Let $M = (Q, \Sigma, \Delta, q_0, R, P, h)$ and let $c > 0$ be such that
for every $s \in T_\Sigma$,
$\mathrm{size}(\tau_M(s)) \le c \cdot \mathrm{size}(s)$. Assume now that $M$ is
not gfci. This leads to a contradiction as follows.

Since $\mathrm{top}(M)$ is not fci we can apply Lemma 4 to it. Let
$q_1, q_2 \in Q$, $s_0, s_1 \in T_\Sigma$, $u_0, u_0u_1 \in V(s_0)$, and
$p \in P$ such that (1)-(4) of Lemma 4 hold (with $s_1 = s_0/u_0$). By Lemma 2,
(1)-(4) also hold for $M$. Note that since $q_0$ does not occur in any
right-hand side of $M$ by Assumption (i), (2) implies that $q_2 \ne q_0$.
Thus, since $M$ is i-proper and $\langle q_2, p\rangle$ is reachable (by (1)
and (2)), the set $M_{q_2}(L_p)$ is infinite and hence contains arbitrarily
large trees. Let $s_2 \in L_p$ such that
$\#_\Delta(M_{q_2}(s_2)) > c \cdot (c_0 + c_1 + 1)$, where
$c_0 = \mathrm{size}(s_0[u_0 \leftarrow p]) - 1$ and
$c_1 = \mathrm{size}(s_1[u_1 \leftarrow p]) - 1$. We now pump the tree
$s_1[u_1 \leftarrow p]$ in the tree
$(s_0[u_0 \leftarrow p]) \bullet (s_1[u_1 \leftarrow p]) \bullet s_2$: for
$i \ge 0$, let
$t_i = (s_0[u_0 \leftarrow p]) \bullet (s_1[u_1 \leftarrow p])^i \bullet s_2$.
It follows from (2)-(4) that for every $i \ge 0$,
$\mathrm{sts}_M(t_i, u_0u_1^i)$ contains at least $i$ occurrences of $q_2$.
Thus (by nondeletion),
$\mathrm{size}(\tau_M(t_i)) \ge i \cdot \#_\Delta(M_{q_2}(s_2))$ which is
greater than $i \cdot c \cdot (c_0 + c_1 + 1)$, by the choice of $s_2$.

Now let $i = c_2 = \mathrm{size}(s_2)$. Since
$\mathrm{size}(t_i) = c_0 + ic_1 + c_2$ this means that
$\mathrm{size}(\tau_M(t_i)) > c \cdot \mathrm{size}(t_i)$ because
$\mathrm{size}(\tau_M(t_i)) > ic(c_0 + c_1 + 1) \ge c(c_0 + ic_1 + i) = c(c_0 + ic_1 + c_2) = c \cdot \mathrm{size}(t_i)$
which contradicts the choice of $c$. □

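Next we show how to get from gfci to fcp (for lsi MTT$^R$s). Consider the MTT
$M_2 = (Q, \Sigma, \Delta, q_0, R)$ with $Q = \{q_0^{(0)}, q^{(1)}\}$, binary
input symbols $\sigma$ and $\gamma$, and input leaves $\alpha$ and $\beta$. For
$\delta \in \{\sigma, \gamma\}$ and $a \in \{\alpha, \beta\}$ let the following
rules be in $R$.

  $\langle q_0, \delta(x_1, x_2)\rangle \to \delta(\langle q, x_1\rangle(\delta), \langle q, x_2\rangle(\delta))$
  $\langle q, \delta(x_1, x_2)\rangle(y_1) \to \delta(\langle q, x_1\rangle(y_1), \langle q, x_2\rangle(y_1))$
  $\langle q_0, a\rangle \to a(a)$
  $\langle q, a\rangle(y_1) \to a(y_1)$

Intuitively, $M_2$ moves the root symbol of the input tree to each of its
leaves; e.g., for $s = \sigma(\gamma(\alpha, \beta), \alpha)$ we get
$\tau_{M_2}(s) = \sigma(\gamma(\alpha(\sigma), \beta(\sigma)), \alpha(\sigma))$.
Thus, $\tau_{M_2}$ is lsi (because
$\mathrm{size}(\tau_{M_2}(s)) \le 2 \cdot \mathrm{size}(s)$). Clearly, $M_2$ is
not fcp, because $\#_{y_1}(M_q(s))$ equals the number of leaves of $s$. This
time, the reason is that $M_2$ generates a lot of parameter occurrences which
contain only finitely many trees (viz., $\sigma$ and $\gamma$). Again the idea
is to eliminate them: An MTT$^R$ $M$ is parameter proper (for short, p-proper),
if for every $m \ge 1$, $q \in Q^{(m)}$, $j \in [m]$, and $p \in P$, if
$\langle q, p\rangle$ is reachable then $\mathrm{Arg}(q, j, p)$ is infinite,
where $\mathrm{Arg}(q, j, p)$ is the set
$\{\xi_j \mid \exists s \in T_\Sigma, u \in V(s) : M_{q_0}(s[u \leftarrow p])$
has a subtree $\langle q, p\rangle(\xi_1, \ldots, \xi_m)\}$. An MTT$^R$ is
proper, if it is both i-proper and p-proper.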
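The example transducer $M_2$ is small enough to run by hand, and the following Python sketch does exactly that. It is an illustration under assumptions, not the paper's construction: trees are nested tuples, the two states become the hypothetical functions `run_q0` and `run_q`, and the parameter $y_1$ is passed as an ordinary argument. The helper `leaves` is also introduced only for the check.

```python
def size(s):
    """Number of nodes of a tree given as (label, child_1, ..., child_k)."""
    return 1 + sum(size(c) for c in s[1:])

def leaves(s):
    """Number of leaves of s."""
    return 1 if len(s) == 1 else sum(leaves(c) for c in s[1:])

def occurrences(s, label):
    """Number of nodes of s carrying the given label."""
    return int(s[0] == label) + sum(occurrences(c, label) for c in s[1:])

def run_q(s, y):
    """State q (rank 1): pass the parameter y down to every leaf."""
    if len(s) == 1:                       # <q, a>(y1) -> a(y1)
        return (s[0], y)
    return (s[0],) + tuple(run_q(c, y) for c in s[1:])

def run_q0(s):
    """Initial state q0: remember the root symbol in the parameter."""
    if len(s) == 1:                       # <q0, a> -> a(a)
        return (s[0], (s[0],))
    root = (s[0],)                        # the root symbol, used as a leaf
    return (s[0],) + tuple(run_q(c, root) for c in s[1:])

s = ("sigma", ("gamma", ("alpha",), ("beta",)), ("alpha",))
print(run_q0(s))
print(size(s), size(run_q0(s)))
# Number of copies of the parameter y1 produced by q on s:
print(occurrences(run_q(s, ("y1",)), "y1"), leaves(s))
```

The output has one extra node per input leaf, so its size is at most twice the input size, while the number of parameter copies grows with the number of leaves.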

Lemma 8. For every MTT$^R$ $M$ there is (effectively) a proper MTT$^R$
$\mathrm{prop}(M)$ equivalent to $M$. If $M$ is linp then so is
$\mathrm{prop}(M)$.

The idea of the (quite involved) proof is to start with an i-proper MTT$^R$ and
then to make it p-proper by a construction similar to the one in the proof of
Lemma 6, but now not the look-ahead but the states of $\pi(M)$ are used to code
information (about the content of ‘finite-valued’ parameters).

We now apply our second pumping lemma (Lemma 5) to non-fcp MTT$^R_{gfci}$s.

Lemma 9. Let $M$ be a p-proper MTT$^R_{gfci}$. If $\tau_M \in$ LSI, then $M$ is
fcp.

Proof. Let $M = (Q, \Sigma, \Delta, q_0, R, P, h)$ and let $c > 0$ be such that
for every $s \in T_\Sigma$,
$\mathrm{size}(\tau_M(s)) \le c \cdot \mathrm{size}(s)$. Assume that $M$ is not
fcp.

Let $m \ge 1$, $q \in Q^{(m)}$, $j \in [m]$, $s \in T_\Sigma$, $u \in V(s)$,
and $p \in P$ such that (1)-(3) of Lemma 5 hold. Clearly, (1) implies that
$\#_{y_j}(M_q(s)) \ge 2$, and hence $\langle q, p\rangle$ is reachable (by
Assumptions (iv) and (ii), respectively). Thus, by p-properness,
$\mathrm{Arg}(q, j, p)$ is infinite. Let $c_1 = \mathrm{size}(s)$ and let $t$
be a tree in $\mathrm{Arg}(q, j, p)$ with
$\mathrm{size}(t) > c \cdot (c_1 + 1)$. Then there exist $s_0 \in T_\Sigma$ and
$u_0 \in V(s_0)$ such that $M_{q_0}(s_0[u_0 \leftarrow p])$ has a subtree
$\langle q, p\rangle(\xi_1, \ldots, \xi_m)$ with $\xi_j = t$. Let
$c_0 = \mathrm{size}(s_0)$. We now pump the tree $s[u \leftarrow p]$ in the
tree $(s_0[u_0 \leftarrow p]) \bullet (s[u \leftarrow p])$: for $i \ge 0$, let
$t_i = (s_0[u_0 \leftarrow p]) \bullet (s[u \leftarrow p])^i$. It is
straightforward to show, using (1)-(3) and the fact that $M$ is nondeleting,
that $M_q(s[u \leftarrow p]^i)$ contains more than $i$ occurrences of $y_j$.
Thus, since $M$ has no erasing rules by Assumption (iii),
$\mathrm{size}(\tau_M(t_i)) > \mathrm{size}(t) \cdot i > c(c_1 + 1)i$.

Let $i = c_0$. Since $\mathrm{size}(t_i) \le c_0 + c_1i = (c_1 + 1)i$, this
means that $\mathrm{size}(\tau_M(t_i)) > c \cdot \mathrm{size}(t_i)$, which
contradicts the choice of $c$. □

6 Main Results 

From Lemmas 7 and 9 we conclude that if $M$ is a proper MTT$^R$ and $\tau_M$ is
lsi, then $M$ is both gfci and fcp. Unfortunately this is not our desired
characterization yet. The class $MTT^R_{gfci,fcp}$ is too large: it contains
transductions that are not lsi.



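Example 10. Consider the MTT $M_3 = (Q, \Sigma, \Delta, q_0, R)$ with
$Q = \{q_0^{(0)}, q_1^{(1)}, q_2^{(0)}, i^{(0)}\}$, input symbols
$\sigma^{(2)}, a^{(0)}$, output symbols $\sigma^{(2)}, \delta^{(3)}, a^{(0)}$,
and the following rules:

  $\langle q_\nu, \sigma(x_1, x_2)\rangle \to \langle q_1, x_1\rangle(\langle q_2, x_2\rangle)$   (for $\nu \in \{0, 2\}$)
  $\langle q_1, \sigma(x_1, x_2)\rangle(y_1) \to \delta(y_1, y_1, \langle i, x_1\rangle)$
  $\langle i, \sigma(x_1, x_2)\rangle \to \sigma(\langle i, x_1\rangle, \langle i, x_2\rangle)$
  $\langle q_1, a\rangle(y_1) \to \delta(y_1, y_1, a)$
  $\langle r, a\rangle \to a$   (for $r \in Q^{(0)}$)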



As it turns out, $M_3$ is indeed gfci and fcp: corresponding input and
parameter copying bounds are 1 and 2, respectively. It is also proper. Take
$s_n = \sigma(a, \sigma(a, \ldots \sigma(a, a) \ldots))$, where $n$ is the
number of $\sigma$’s. This tree is translated by $M_3$ into the monadic tree
$\langle q_1, a\rangle^n(\langle q_2, a\rangle)$ and then into the full binary
tree of height $n$ over $\delta$ and $a$, in which every $\delta$ has an
additional right-most leaf $a$. Thus, $\tau_{M_3}$ is not lsi:
$\mathrm{size}(\tau_{M_3}(s_n)) = 3 \cdot 2^n - 2$ and
$\mathrm{size}(s_n) = 2n + 1$.
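The two size formulas can be checked numerically with a short Python sketch. It rests on an assumed reading of the rules of $M_3$, since they are only partly legible in this copy: $q_0$ and $q_2$ send $\sigma(x_1, x_2)$ to $\langle q_1, x_1\rangle(\langle q_2, x_2\rangle)$, $q_1$ sends a leaf $a$ with parameter $y_1$ to $\delta(y_1, y_1, a)$ and $\sigma(x_1, x_2)$ to $\delta(y_1, y_1, \langle i, x_1\rangle)$, the state $i$ copies its input, and every rank-zero state sends $a$ to $a$. The function names are hypothetical.

```python
def size(s):
    """Number of nodes of a tree given as (label, child_1, ..., child_k)."""
    return 1 + sum(size(c) for c in s[1:])

def s_n(n):
    """The input tree sigma(a, sigma(a, ... sigma(a, a) ...)) with n sigmas."""
    t = ("a",)
    for _ in range(n):
        t = ("sigma", ("a",), t)
    return t

def run_i(s):
    """State i: copies its input tree unchanged (assumed reading)."""
    return s

def run_q1(s, y):
    """State q1 (rank 1): duplicates its parameter under a delta node."""
    if len(s) == 1:
        return ("delta", y, y, ("a",))
    return ("delta", y, y, run_i(s[1]))

def run_q02(s):
    """States q0 and q2, which share their rules (assumed reading)."""
    if len(s) == 1:
        return ("a",)
    return run_q1(s[1], run_q02(s[2]))

for n in range(0, 8):
    print(n, size(s_n(n)), size(run_q02(s_n(n))), 3 * 2 ** n - 2)
```

Under this reading the output size satisfies the recurrence $T(n) = 2T(n-1) + 2$ with $T(0) = 1$, whose solution is $3 \cdot 2^n - 2$, while the input grows only linearly.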

If $M$ is fcp, we can apply Proposition 1 to get an equivalent MTT$^R$
$\mathrm{linp}(M)$ that is linp (this is effective [EM99; Lemma 6.3]). To apply
Lemma 7 again, we first have to make $\mathrm{linp}(M)$ proper again (by
applying ‘prop’ to it), because its construction does not preserve properness.
We are ready to prove our main theorem.

Theorem 11. Let $M$ be an MTT. Then the following statements are equivalent:

(1) $\tau_M$ is MSO definable.

(2) $\tau_M$ is of linear size increase.

(3) $\mathrm{prop}(M)$ is fcp and
$\mathrm{prop}(\mathrm{linp}(\mathrm{prop}(M)))$ is fci.

Proof. Since MSOTT $\subseteq$ LSI, (1) $\Rightarrow$ (2). To show
(2) $\Rightarrow$ (3), let $\tau_M$ be lsi. Then, by Lemmas 7 and 9,
$\mathrm{prop}(M)$ is fcp and, by Lemma 7 again,
$\mathrm{prop}(\mathrm{linp}(\mathrm{prop}(M)))$ is gfci and hence fci (by
Lemmas 8 and 2). Finally, (3) $\Rightarrow$ (1): If $\mathrm{prop}(M)$ is fcp
and $\mathrm{prop}(\mathrm{linp}(\mathrm{prop}(M))) = M'$ is fci then $M'$ is
fci and linp, by Lemma 8. Thus $\tau_M$ is MSO definable by Proposition 1. □

From Theorem 11, Lemma 3 and the effectivity of prop and linp we obtain the
following decidability result.

Theorem 12. It is decidable for an MTT $M$ whether $\tau_M$ is MSO definable.

From Theorem 11 and Proposition 1 we get our characterization of MSOTT
(recall that $MTT^R = MTT$, cf. [EV85; Theorem 4.21]):

Theorem 13. $MSOTT = MTT \cap LSI$.

Finally, we obtain the characterization of MSO definable top-down tree
transductions (with regular look-ahead) discussed in the Introduction.

Theorem 14. $T^R \cap MSOTT = T^R_{fci} = T^R \cap LSI$.

Proof. From Lemmas 6 and 7 we know that $T^R \cap LSI \subseteq T^R_{gfci}$,
which equals $T^R_{fci}$ by Lemma 2. The inclusion
$T^R_{fci} \subseteq T^R \cap MSOTT$ is immediate from Proposition 1 and
$T^R \cap MSOTT \subseteq T^R \cap LSI$ follows from the fact that
$MSOTT \subseteq LSI$. □

Open Problems. It is not clear how MSO definability could be generalized in
order to obtain the full class of macro tree transductions, or maybe just the
class of polynomial size increase (psi) macro tree transductions. Is psi
decidable for MTTs? If so, what is the complexity (cf. [Dre99])? Finally, we do
not know whether our result generalizes to nondeterministic MTTs, and for
compositions of MTTs we would like to know: is $LSI \cap \bigcup_n MTT^n = MSOTT$?

Abstract. We prove an effective characterization of languages having
dot-depth 3/2. Let $\mathcal{B}_{3/2}$ denote this class, i.e., languages that can be
written as finite unions of languages of the form $u_0L_1u_1L_2u_2 \cdots L_nu_n$,
where $u_i \in A^*$ and $L_i$ are languages of dot-depth one. Let F be a
deterministic finite automaton accepting some language L. Resulting from a
detailed study of the structure of $\mathcal{B}_{3/2}$, we identify a pattern P (cf. Fig. 2)
such that L belongs to $\mathcal{B}_{3/2}$ if and only if F does not have pattern P in
its transition graph. This yields an NL-algorithm for the membership
problem for $\mathcal{B}_{3/2}$.

Due to known relations between the dot-depth hierarchy and symbolic
logic, the decidability of the class of languages definable by $\Sigma_2$-formulas
of the logic FO[<, min, max, S, P] follows. We give an algebraic
interpretation of our result.



1 Introduction 

We contribute to the theory of finite automata and regular languages, with
consequences in logic as well as in algebra. In particular, we deal with
starfree regular languages. These are languages constructed from alphabet
letters using only Boolean operations together with concatenation. Alternating
these two kinds of operations, in order to distinguish combinatorial and
sequential aspects, leads to the definition of concatenation hierarchies that
exhaust the class of starfree languages. Prominent examples are the dot-depth
hierarchy, first studied in [??], and the Straubing-Thérien hierarchy
[Str81, The81, Str85]. Both are known to be strict and closely related to each
other [??]. Most naturally arising questions concerning these hierarchies are
of major interest in different research areas, since close connections have
been exposed, e.g., to finite model theory, theory of finite semigroups,
complexity theory and others.

Here we deal with the dot-depth hierarchy. Let A be some finite alphabet
with $|A| \geq 2$. For a class C of languages over $A^+$ let POL(C) be its polynomial
closure, i.e., the class of languages $L \subseteq A^+$ that can be written as a finite
union of languages $u_0L_1u_1L_2u_2 \cdots L_nu_n$, where $u_i \in A^*$, $L_i \in C$ and
$n \geq 0$. Denote by BC(C) its Boolean closure, i.e., the closure of C under finite
union, finite intersection and complementation. Then the dot-depth hierarchy
can be defined as the following family of classes (notations are adopted from
[PW97]):

1. $\mathcal{B}_0 =_{def} \{\emptyset, A^+\}$,

2. $\mathcal{B}_{n+1/2} =_{def} POL(\mathcal{B}_n)$ for $n \geq 0$, and

3. $\mathcal{B}_{n+1} =_{def} BC(\mathcal{B}_{n+1/2})$ for $n \geq 0$.

For a language $L \subseteq A^+$ and a minimal n with $L \in \mathcal{B}_{n/2}$ we say that L has
dot-depth n/2. One obtains the same hierarchy classes when setting
$\mathcal{B}_{n+1/2} =_{def} POL(co\mathcal{B}_{n-1/2})$ with $co\mathcal{B}_{n-1/2} =_{def} \{L : \overline{L} \in \mathcal{B}_{n-1/2}\}$
[Gla98]. By definition, all $\mathcal{B}_{n+1/2}$ are closed under union, and it is known
that these classes are also closed under intersection [??]. The question
whether there exists an algorithm deciding $L(F) \in \mathcal{B}_{n/2}$ for $n \geq 0$ and
deterministic finite automata F (dfa F, for short) is known as the dot-depth
problem. Although many researchers believe the answer should be yes, some
suspect the contrary. To our knowledge, only the classes $\mathcal{B}_0$, $\mathcal{B}_{1/2}$ and
$\mathcal{B}_1$ were known to be decidable [Kna83, PW97]. In particular, the case of
dot-depth 3/2 was mentioned as open in [Pin96, PW97]. This can be seen in
contrast to the Straubing-Thérien hierarchy, for which, besides levels 0, 1/2
and 1, level 3/2 is also known to be decidable [Arf91, PW97]. Some partial
results are known for level 2 of this hierarchy, e.g., its decidability in the
case of a 2-letter alphabet [Str88].

In this paper we prove an effective characterization of languages having
dot-depth 3/2. With an automata-theoretic approach we study the class $\mathcal{B}_{3/2}$
in detail. Fix some $k \geq 0$. We look at a word w as a word over $A^{k+1}$ by taking
together each k + 1 consecutive letters, and call this the k-decomposition of
w. In this way we obtain classes $\mathcal{B}_{3/2,k}$ for which
$\mathcal{B}_{3/2} = \bigcup_{k \geq 0} \mathcal{B}_{3/2,k}$. The k-decomposition approach was used before in
several contexts, e.g., when relating the dot-depth hierarchy and the
Straubing-Thérien hierarchy [??], and for a levelwise analysis of dot-depth
one languages [??].

We first look at the family of classes $\mathcal{B}_{3/2,k}$. With the help of a series of
technical lemmas we prove a useful normal form representation for languages
in $\mathcal{B}_{3/2,k}$. Then we provide a combinatorial lemma which keeps control of the
lengths of factors x, u, y of some input w to a dfa such that w = xuy and the
state reached after input x'u has a loop with label u for all x'. Iterated
applications of this lemma are a basic tool in subsequent proofs.

Let F be a dfa accepting some language L. We show that L belongs to $\mathcal{B}_{3/2,k}$
if and only if F does not have pattern $P_k$ in its transition graph (cf. Fig. 1).
This yields an NL-algorithm for the membership problem for $\mathcal{B}_{3/2,k}$ which
looks for the non-existence of $P_k$ in F. Since for k = 0 we encounter level 3/2
of the Straubing-Thérien hierarchy, we provide as a by-product a
self-contained reproof of the normal form and the decidability result for this
class ([Arf91] and [PW97] use deep results from [Has88] and [Sim90],
respectively).

Our generalization to arbitrary k enables us to identify a pattern P such
that L belongs to $\mathcal{B}_{3/2}$ if and only if F does not have pattern P in its
transition graph (cf. Fig. 2). So we can affirmatively answer the decidability
question for $\mathcal{B}_{3/2}$, even in an efficient way: looking for the non-existence of
P in F yields an NL-algorithm for the membership problem of this class.
Moreover, the proof is such that an algorithm can be derived to determine the
exact level k of a given language inside $\mathcal{B}_{3/2}$.

We draw some consequences which are due to various relations of the classes
of the dot-depth hierarchy to other fields of research. The connection to
first-order logic goes back to [??]. The dot-depth hierarchy is related to the
first-order logic FO[<, min, max, S, P] having unary relations for the alphabet
symbols from A, the binary relation <, the successor (predecessor) function S
(P, resp.), and constants min and max. Let $\Sigma_n$ be the subclass of this logic
which is defined by at most n - 1 quantifier alternations, starting with an
existential quantifier. It has been proved in [Tho82] (see also [??]) that
$\Sigma_n$-formulas describe just the $\mathcal{B}_{n-1/2}$ languages and that the Boolean
combinations of $\Sigma_n$-formulas describe just the $\mathcal{B}_n$ languages. Due to this
characterization we can conclude the decidability of the class of languages
definable by $\Sigma_2$-formulas of the logic FO[<, min, max, S, P]. We also give an
algebraic interpretation of our result and characterize the languages of
dot-depth 3/2 by a condition on their ordered syntactic semigroups. In
algebraic terms, this yields an effective characterization of the variety of
finite ordered semigroups corresponding to the positive variety of languages
as which $\mathcal{B}_{3/2}$ can be understood.

If one looks at our result in combination with the known forbidden pattern
characterization of $\mathcal{B}_{1/2}$ from [PW97], it is easy to discover regularities
between both patterns. Continuing them leads to the definition of a pattern
$R_n$ for $n \geq 1$, which defines decidable subclasses of starfree languages in a
forbidden pattern manner. We conclude this paper with some informal arguments
supporting the possibility that the class defined by $R_n$ coincides with
$\mathcal{B}_{n-1/2}$ also for all $n \geq 3$.

A comprehensive treatment of the issues presented in this extended abstract
is given in a self-contained way in [??], see
http://www.informatik.uni-wuerzburg.de/reports/tr.html.



2 The Classes $\mathcal{B}_{3/2,k}$

Throughout the paper we consider languages as subsets of $A^+$. Let $k \geq 0$. We
denote by $A^{\leq k}$ the set of words from $A^+$ of length less than or equal to k
(similarly, $A^{<k}$, $A^{>k}$, ...). It will be useful for us to look at $w \in A^+$
as a word over $A^{k+1}$ by taking together each k + 1 consecutive letters. We
denote elements from $A^{k+1}$ as $\alpha, \beta, \gamma, \ldots$ and subsets of $A^{k+1}$ as
$\Delta, \Gamma, \ldots$. Let $w = a_1a_2 \cdots a_{k+l} \in A^+$ for some $l \geq 1$. We call
$(\beta_1, \beta_2, \ldots, \beta_l)$ the k-decomposition of w if $\beta_i = a_i \cdots a_{i+k}$ for
$1 \leq i \leq l$. Intuitively, k indicates by how many letters from A consecutive
$\beta_i$ overlap. We set $\alpha(w) =_{def} \{\beta_1, \beta_2, \ldots, \beta_l\}$. Next we define
languages that admit the same k-decomposition with respect to given elements
and subsets of $A^{k+1}$.

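Definition 1. Let $k, n \geq 0$ and $\alpha_1, \ldots, \alpha_n \in A^{k+1}$, $\Sigma_0, \ldots, \Sigma_n \subseteq A^{k+1}$.
For every $w \in A^+$ we say $w \in (\Sigma_0, \alpha_1, \Sigma_1, \ldots, \alpha_n, \Sigma_n)_k$ if and only if
$|w| \geq k + 1$, w has k-decomposition $(\beta_1, \ldots, \beta_l)$, and there exist
$0 = j_0 < j_1 < j_2 < \cdots < j_n < j_{n+1} = l + 1$ such that

(a) $\beta_{j_i} = \alpha_i$ for $1 \leq i \leq n$ and

(b) $\beta_j \in \Sigma_i$ for $0 \leq i \leq n$ and $j_i < j < j_{i+1}$.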
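
The k-decomposition and the membership condition of Definition 1 can be made concrete with a small sketch. The function names (`k_decomposition`, `alpha`, `in_language`) and the representation of words as Python strings are assumptions made here for illustration; the membership test simulates the choice of the indices $j_1 < \cdots < j_n$ by tracking how many of the $\alpha_i$ may already have been matched.

```python
from typing import List, Sequence, Set


def k_decomposition(w: str, k: int) -> List[str]:
    """All factors of w of length k + 1, from left to right."""
    if len(w) < k + 1:
        raise ValueError("the k-decomposition needs |w| >= k + 1")
    return [w[i:i + k + 1] for i in range(len(w) - k)]


def alpha(w: str, k: int) -> Set[str]:
    """Set of elements occurring in the k-decomposition of w."""
    return set(k_decomposition(w, k))


def in_language(w: str, k: int,
                sigmas: Sequence[Set[str]],
                alphas: Sequence[str]) -> bool:
    """Test w in (S_0, a_1, S_1, ..., a_n, S_n)_k following Definition 1."""
    if len(sigmas) != len(alphas) + 1:
        raise ValueError("need exactly one more subset than elements")
    if len(w) < k + 1:
        return False
    n = len(alphas)
    # State i means: a_1, ..., a_i are matched, we are inside the S_i block.
    states = {0}
    for beta in k_decomposition(w, k):
        successors = set()
        for i in states:
            if beta in sigmas[i]:
                successors.add(i)          # beta is absorbed by S_i
            if i < n and beta == alphas[i]:
                successors.add(i + 1)      # beta is chosen as a_{i+1}
        states = successors
    return n in states
```

For instance, with k = 1 the word `aaabbb` has the 1-decomposition `aa, aa, ab, bb, bb` and therefore belongs to the language given by `sigmas = [{"aa"}, {"bb"}]` and `alphas = ["ab"]`.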

If we write the expression $(\Sigma_0, \alpha_1, \Sigma_1, \ldots, \alpha_n, \Sigma_n)_k$ we understand this as a
syntactical object describing some language. We do not distinguish between this
object and the language it stands for. So the language
$(\Sigma_0, \alpha_1, \Sigma_1, \ldots, \alpha_n, \Sigma_n)_k$ consists of those words w with $|w| \geq k + 1$
whose k-decomposition starts with a number (possibly zero) of elements from
$\Sigma_0$, then $\alpha_1$, followed by a number (possibly zero) of elements from $\Sigma_1$,
then $\alpha_2$ and so on. Note that in case k = 0 we deal with the usual
concatenation, e.g., $(\Sigma_0)_0 = \Sigma_0^+$ and
$(\Sigma_0, \alpha_1, \Sigma_1, \alpha_2, \Sigma_2)_0 = \Sigma_0^*\alpha_1\Sigma_1^*\alpha_2\Sigma_2^*$. For convenient notation we
write $(w|\Sigma_0, \alpha_1, \Sigma_1, \ldots, \alpha_n, \Sigma_n|v)_k$ instead of
$(wA^* \cap A^*v \cap (\Sigma_0, \alpha_1, \Sigma_1, \ldots, \alpha_n, \Sigma_n)_k)$.

Definition 2. Let $k \geq 0$ and $m \geq 1$. Then $\mathcal{B}_{1,(m,k)}$ is the class of languages
$L \subseteq A^+$ that are in the Boolean algebra generated by languages $L_i$ such that
$L_i = D$ with $D \subseteq A^{\leq k}$ or

$L_i = (w|A^{k+1}, \alpha_1, A^{k+1}, \alpha_2, \ldots, \alpha_m, A^{k+1}|v)_k$

where $\alpha_j \in A^{k+1}$ and $w, v \in A^k$.

The definition of these classes is motivated by a characterization of $\mathcal{B}_1$ in
terms of a congruence, parameterized by k and m, introduced in [Sim72]. With
the next theorem we recall in our notation the fact that the classes
$\mathcal{B}_{1,(m,k)}$ refine $\mathcal{B}_1$.

Theorem 1 ([Sim72]). Let $L \subseteq A^+$. Then $L \in \mathcal{B}_1$ if and only if there exist
$k \geq 0$ and $m \geq 1$ such that $L \in \mathcal{B}_{1,(m,k)}$.

For an overview on hierarchies which result from fixing one or the other
parameter see [??]. Among others, a hierarchy in $\mathcal{B}_1$ obtained by fixing k has
been studied [Sim72, Ste85].

Definition 3 ([Sim72]). Let $k \geq 0$. Then $\mathcal{B}_{1,k} =_{def} \bigcup_{m \geq 1} \mathcal{B}_{1,(m,k)}$.

Proposition 1. $\mathcal{B}_1 = \bigcup_{k \geq 0} \mathcal{B}_{1,k}$ and $\mathcal{B}_{3/2} = \bigcup_{k \geq 0} POL(\mathcal{B}_{1,k})$.

The first equation follows from Theorem 1, while the latter is due to
$\mathcal{B}_{3/2} = POL(\mathcal{B}_1) = POL(\bigcup_{k \geq 0} \mathcal{B}_{1,k}) = \bigcup_{k \geq 0} POL(\mathcal{B}_{1,k})$, which gives rise
to the starting point of our investigations.

Definition 4. Let $k \geq 0$. Then $\mathcal{B}_{3/2,k} =_{def} POL(\mathcal{B}_{1,k})$.

By definition, languages in $\mathcal{B}_{3/2,k}$ are finite unions of concatenations of
words with languages from $\mathcal{B}_{1,k}$, which are in turn Boolean combinations of
the languages $L_i$ from Definition 2, a somewhat unstructured representation. We
give the following normal form.

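Theorem 2. Let $k \geq 0$ and $L \subseteq A^+$. Then $L \in \mathcal{B}_{3/2,k}$ if and only if L can
be written as a finite union of languages $L_i$ such that $L_i = D$ for $D \subseteq A^{\leq k}$
or $L_i = (\Sigma_0, \alpha_1, \Sigma_1, \ldots, \alpha_n, \Sigma_n)_k$ where $n \geq 1$, $\alpha_j \in A^{k+1}$ and
$\Sigma_j \subseteq A^{k+1}$.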

3 Finding Automata Loops in Words

A useful tool in our proofs is the fact that we can find factors in a word which
lead to loops in a given dfa. It is important here to analyse the length needed
to find such a factor, depending on the size of the dfa in question. To this
end, we define a bounding function $\mathcal{K}(n)$ as

/C(n) =def (n + 

and prove the following rather technical lemma. Let $\delta$ be the transition
function of the given dfa F. Then every $w \in A^*$ induces a total mapping
$\delta^w : S \to S$ on the set of states S with $\delta^w(s) =_{def} \delta(s, w)$ for all $s \in S$.

Lemma 1. For every dfa F and for all $v_0, \ldots, v_n \in A^+$ there exist an $m \geq 0$
and indices $0 = i_0 < i_1 < \cdots < i_{2m+1} = n + 1$ such that

1. $i_{j+1} - i_j \leq \mathcal{K}(|F|)$ for $0 \leq j \leq 2m$ and

2. $\delta^{uu} = \delta^u$ for all $u = v_{i_j}v_{i_j+1} \cdots v_{i_{j+1}-1}$ with $1 \leq j \leq 2m$ and
   $j \equiv 1 \pmod 2$.

To give some intuition, this means for factors of length one that for every
$w = v_0v_1 \cdots v_n$ with $v_i \in A$ there exist words $w_0, \ldots, w_m, u_1, \ldots, u_m$ of
length at most $\mathcal{K}(|F|)$ such that $w = w_0u_1w_1 \cdots u_mw_m$ and
$\delta^{u_iu_i} = \delta^{u_i}$ for $1 \leq i \leq m$. We use Lemma 1 for arbitrary factors $v_i$ in
the proof of Lemma 2 below.
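
The loop condition $\delta^{uu} = \delta^u$ used above is easy to test for a concrete automaton. The following is a minimal sketch under assumptions made here for illustration: states are numbered 0, ..., n-1, the transition function is a dictionary keyed by (state, letter), and the helper names are hypothetical. It only checks the condition for a given factor u; it does not implement the length bounds of the lemma.

```python
from typing import Dict, Tuple

Delta = Dict[Tuple[int, str], int]  # (state, letter) -> state


def induced_mapping(delta: Delta, n_states: int, w: str) -> Tuple[int, ...]:
    """Return delta^w as a tuple whose entry s is delta(s, w)."""
    mapping = list(range(n_states))  # the empty word induces the identity
    for letter in w:
        mapping = [delta[(t, letter)] for t in mapping]
    return tuple(mapping)


def is_loop_factor(delta: Delta, n_states: int, u: str) -> bool:
    """True iff delta^{uu} = delta^u, i.e. every state reached by u has a u-loop."""
    once = induced_mapping(delta, n_states, u)
    twice = induced_mapping(delta, n_states, u + u)
    return once == twice


# A small example dfa over {a, b} with states 0, 1, 2 (state 2 is a sink).
EXAMPLE_DELTA: Delta = {
    (0, "a"): 1, (1, "a"): 2, (2, "a"): 2,
    (0, "b"): 0, (1, "b"): 0, (2, "b"): 2,
}
```

In this example the factor `a` is not a loop factor (state 0 is sent to 1, and then to 2), whereas `aa` and `ab` are.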

4 Forbidden Pattern Characterization of $\mathcal{B}_{3/2,k}$

Let F be a dfa accepting a language L. We want to show that L is in $\mathcal{B}_{3/2,k}$ if
and only if F does not have pattern $P_k$ (cf. Fig. 1) in its transition graph. As
usual in forbidden pattern proofs, there is on the one hand an implication that
is easier to prove (Theorem 3) and on the other hand a more difficult one
(Theorem 4).

Fig. 1. Pattern $P_k$ with $u, w, z \in A^*$, $v \in A^{\geq k+1}$, initial state $s_0$, accepting
state $s^+$, rejecting state $s^-$ and $\alpha(vwv) \subseteq \alpha(vv)$ for k-decompositions of words.

Theorem 3. Let $k \geq 0$. If a dfa F has pattern $P_k$, then $L(F) \notin \mathcal{B}_{3/2,k}$.

Proof (Sketch). Suppose F has pattern $P_k$ and $L(F) \in \mathcal{B}_{3/2,k}$. By Theorem 2 we
have that L(F) is a finite union of languages $L_i$ such that $L_i = D$ for
$D \subseteq A^{\leq k}$ or $L_i = (\Sigma_0, \alpha_1, \Sigma_1, \ldots, \alpha_n, \Sigma_n)_k$. Since we can pump up
$uz \in L(F)$ to arbitrary $uv^jz \in L(F)$, we can determine a sufficiently large j
such that $uv^jz \in L_i$ for some i and there is some l such that $\alpha(vv) \subseteq \Sigma_l$.
Because $\alpha(vwv) \subseteq \alpha(vv)$ by pattern $P_k$, we can insert w without leaving L(F),
a contradiction to pattern $P_k$. □

The more complicated implication will be a consequence of Lemma 2 given
below. There we derive from every $x \in L$ a subset of L which contains x and
which can be described by expressions of bounded size. In particular, we consider
expressions E of the form

$w_0 \cdot (v_1|\Sigma_1|v_1')_k \cdot w_1 \cdots (v_n|\Sigma_n|v_n')_k \cdot w_n$

where $w_i \in A^+$, $v_i, v_i' \in A^k$ and $\Sigma_i \subseteq A^{k+1}$. We define the size of E as
$|w_0w_1 \cdots w_n|$ and identify E with the language described by E. For a fixed
$k \geq 0$ let us denote the set of all such expressions by $\mathcal{E}_k$. To analyse the
size of expressions in Lemma 2 we make the following definition, where
variables a, f, n will be associated with the size of the alphabet A, the size
of the dfa F and the cardinality of $\alpha(w)$ for a given word w, respectively.

{ k : if n = 0 

2/-f + fc + l : ifn=l 

3/C(/) • (5/-^a^ + 1) • £(fc, a, /, n — 1) : otherwise 



Lemma 2. Let $k \geq 0$ and let F be a dfa which does not have pattern $P_k$. For
every $x \in A^+$ there exists an expression $E_x \in \mathcal{E}_k$ of size
$\leq \mathcal{L}(k, |A|, |F|, |\alpha(x)|)$ with $x \in E_x$, and for all $x', x'' \in A^*$ we have

$x'xx'' \in L(F) \Longrightarrow x'E_xx'' \subseteq L(F)$.

In the following, the term 'short' ('long') means that the size of an expression
can (cannot, resp.) be bounded by a function in k, |A|, |F| and $|\alpha(x)|$ (we use
'small' and 'large' for cardinalities).

Proof (Sketch). We prove the lemma by induction on $|\alpha(x)|$. If $|\alpha(x)| = 0$ then
x is short and we are done. If $|\alpha(x)| = 1$, it follows that x is a power of
some letter $a \in A$. Now it is easy to see that either x is short or
$a^i \cdot (a^k|\{a^{k+1}\}|a^k)_k \cdot a^j$ provides the desired expression for suitable
choices of small i and j.

In the induction step we consider $x \in A^+$ with $|\alpha(x)| = n + 1 \geq 2$. First
of all we decompose x into factors $s_i$ (so-called 'sectors') with $|\alpha(s_i)| \leq n$
as follows. Determine the longest prefix $s_1$ of x such that $|\alpha(s_1)| \leq n$. Now
we start over with the remaining part of x and determine its longest prefix
$s_2$ such that $|\alpha(s_2)| \leq n$, and so on. Observe that we obtain a factorization
of x into sectors $s_i$ such that $\alpha(s_is_{i+1}) = \alpha(x)$. Furthermore, the induction
hypothesis provides us with short expressions for sectors constructed in this
way.

Note that neither the length of sectors nor their number need be small. The
main task of the induction step is to replace the large number of sectors by a
small number of terms $(v|\Sigma|v')_k$ in such a way that (i) we do not leave L(F) if
we started with $x \in L(F)$ and (ii) we obtain an expression where only a small
number of sectors is left.

If the number of sectors is already small, we can replace each sector s with
the expression provided by the induction hypothesis and we are done.

If the number of sectors is large, we combine them to pairs
$p_i =_{def} s_{2i-1}s_{2i}$ in order to have $\alpha(p_i) = \alpha(x)$. Now we apply Lemma 1 to
these pairs and get a partitioning of the sequence of the $p_i$ with
$x = w_0u_1w_1 \cdots u_mw_m$ such that (i) each partition $u_i$ ($w_i$, resp.) consists of
a small number of pairs and (ii) every partition $u_i$ leads to a $u_i$-loop (i.e.,
$\delta^{u_iu_i} = \delta^{u_i}$) with $\alpha(u_i) = \alpha(x)$. Now we assign to each $u_i$ a tag
representing the induced mapping together with the k-suffix of $u_i$. In a next
step we want to find maximal non-intersecting factors between some $u_i$ and
$u_j$ having the same tags (so-called 'regions'). Consider the simple greedy
algorithm which repeatedly chooses a largest factor between some $u_i$, $u_j$
having the same tags, such that the region between $u_i$ and $u_j$ does not
intersect an existing region (it stops if this is not possible). Since the
number of different tags is small, it can be shown that this algorithm returns
a small number of regions, such that the number of $u_i$ and $w_i$ in any gap
between those regions is also small. It follows that the number of sectors in
any gap between regions is bounded. Note that regions may contain a large
number of sectors.

We treat all regions of x from right to left. Consider a particular region
between $u_i$ and $u_j$. Then we replace this region with the term
$T = u_i \cdot (p|\alpha(x)|s)_k \cdot u_j$ where p is the k-prefix and s is the k-suffix of
the word $w_iu_{i+1}w_{i+1} \cdots u_{j-1}w_{j-1}$. Now we have reached a situation where we
make use of the fact that F does not have pattern $P_k$. Using Lemma 1 and the
tags, it can be observed that $u_i$ and any word from T lead to a $u_i$-loop. It
follows that if we leave L(F) with some word from T, then we would find pattern
$P_k$ in F, a contradiction. We can continue this substitution from right to
left, region by region, without leaving L(F), since the tags to the left of the
substitution position remain valid. In this way we obtain an expression
$E_x' \in \mathcal{E}_k$ with $x \in E_x'$. Since the whole argument is independent of prefixes
x' and suffixes x'', we can show $x'xx'' \in L(F) \Longrightarrow x'E_x'x'' \subseteq L(F)$.
Furthermore, the number of terms $(v|\Sigma|v')_k$ and the number of remaining
sectors in $E_x'$ is small. If we apply the induction hypothesis to these
sectors, we obtain the desired expression $E_x$. This completes the induction. □



Theorem 4. Let $k \geq 0$. If a dfa F does not have pattern $P_k$, then
$L(F) \in \mathcal{B}_{3/2,k}$.

Proof (Sketch). Let F be a dfa which does not have pattern $P_k$. By Lemma 2 we
find for every $x \in L(F)$ a short expression $E_x \subseteq L(F)$ with $x \in E_x$. Since
there is only a finite number of different expressions of $\mathcal{E}_k$ having the same
size, we can write L(F) as a finite union of expressions of $\mathcal{E}_k$. With the help
of Theorem 2 it can be shown that languages of $\mathcal{E}_k$ are in $\mathcal{B}_{3/2,k}$. □

Taking together Theorems 3 and 4 we obtain the main result of this section.

Theorem 5. Let $k \geq 0$ and let F be a dfa. Then $L(F) \in \mathcal{B}_{3/2,k}$ if and only if
F does not have pattern $P_k$.

To see that this characterization is effective we provide an efficient algorithm
to check the non-existence of pattern $P_k$ in the transition graph of a given
dfa. In particular, we show that the occurrence of pattern $P_k$ can be decided
in nondeterministic logarithmic space (NL) for a fixed $k \geq 0$, which is a class
closed under complementation. To this end we guess the states $s_1, s_2, s^+, s^-$
and check the existence of the words u, v, w, z applying the same technique that
solves the graph accessibility problem. While guessing v and w we additionally
store the last k letters, which then enables us to determine all elements of
$\alpha(vwv)$ and $\alpha(vv)$. Since k and the size of the alphabet can be considered as
constants, all this can be done in NL.
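
The basic building block named above, graph accessibility in the transition graph, can be illustrated as follows. This is only a deterministic stand-in (breadth-first search, which uses linear rather than logarithmic space) and not the NL procedure itself; it also ignores the bookkeeping of the last k letters. The function name and the dictionary encoding of the transition function are assumptions made for this sketch.

```python
from collections import deque
from typing import Dict, Optional, Tuple

Delta = Dict[Tuple[int, str], int]  # (state, letter) -> state


def witness_word(delta: Delta, alphabet: str,
                 source: int, target: int) -> Optional[str]:
    """Shortest word w with delta(source, w) = target, or None if none exists.

    The empty word is returned when source == target.
    """
    parent = {source: None}  # state -> (previous state, letter) or None
    queue = deque([source])
    while queue:
        state = queue.popleft()
        if state == target:
            letters = []
            while parent[state] is not None:
                state, letter = parent[state]
                letters.append(letter)
            return "".join(reversed(letters))
        for letter in alphabet:
            successor = delta[(state, letter)]
            if successor not in parent:
                parent[successor] = (state, letter)
                queue.append(successor)
    return None
```

A pattern test along these lines would enumerate candidate states in place of guessing them and ask, for each required path, whether a witness word exists.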

Furthermore, Theorem 5 allows a concise proof of the strictness of the
hierarchy of classes $\mathcal{B}_{3/2,k}$. We obtain $\mathcal{B}_{3/2,k} \subsetneq \mathcal{B}_{3/2,k+1}$ for $k \geq 0$ with the help of the
witnessing languages Lk — def (a'=+i6, : 0 < i < k + 1} , a’^+^b) 

5 Forbidden Pattern Characterization of $\mathcal{B}_{3/2}$

We identify the pattern P given in Fig. 2, which characterizes $\mathcal{B}_{3/2}$.



Fig. 2. Pattern P with initial state $s_0$, accepting state $s^+$, rejecting state
$s^-$, $m \geq 0$, $u_i, w_i \in A^+$ and $u, z, v_i \in A^*$.

Theorem 6. Let F be a dfa accepting some language L. Then $L \in \mathcal{B}_{3/2}$ if and
only if F does not have pattern P.

Proof (Sketch). We first show that the existence of pattern P implies the
existence of pattern $P_k$ for every $k \geq 0$. As witnessing words take u, z and v =def

U^VoU^WiU^Viu'l ■ ■ ■ and W =def U^Wiu'l ■ ■ ■ 

This definition ensures that each element of the k-decomposition of vwv overlaps
at most two of the powers of the $u_i$. It follows that $\alpha(vwv) \subseteq \alpha(vv)$.

Now suppose that F has pattern $P_k$ for every $k \geq 0$. In particular, we find a
pattern $P_k$ for $k =_{def} 3\mathcal{K}(|F|)$ with $|w| \geq k + 1$ (take vwv instead of w if
necessary). By Lemma 1 we can write w as $w = w_0u_1w_1 \cdots u_lw_l$ with words
$w_j, u_i \in A^{\leq \mathcal{K}(|F|)}$ such that $\delta^{u_iu_i} = \delta^{u_i}$. Since $\alpha(vwv) \subseteq \alpha(vv)$ and
$|u_iw_iu_{i+1}| \leq k$ we can find each factor $u_iw_iu_{i+1}$ in vv. This argument leads
to pattern P. □

Since this proof establishes a bound on k in the size of the automaton, we can
also find an algorithm which determines the minimal k such that
$L(F) \in \mathcal{B}_{3/2,k}$ for a given dfa F.

As before in the case of the classes $\mathcal{B}_{3/2,k}$, we now exploit Theorem 6 and
construct an efficient algorithm which solves the membership problem for
$\mathcal{B}_{3/2}$. Looking for pattern P can also be done in NL, since we may check
piecewise the occurrences of the respective subgraphs for $w_iu_iv_iu_i$ and
$w_iu_i$. Note that we do not need to bound m, since no bound is required for the
length of paths in an NL-computation.

Theorem 7. The membership problem for $\mathcal{B}_{3/2}$ is in NL.

6 Further Consequences

Due to the various characterizations of the classes of the dot-depth hierarchy,
Theorem 6 has immediate consequences in other fields of research. The
correspondence of the class of languages definable by $\Sigma_n$-formulas of the
logic FO[<, min, max, S, P] and $\mathcal{B}_{n-1/2}$ from [Tho82] has already been mentioned
in the introduction. Due to this characterization we have the following
corollary.

Corollary 1. Given a regular language L, it is decidable whether L is definable
by a $\Sigma_2$-formula of the logic FO[<, min, max, S, P].

An algebraic interpretation of our Theorem 6 can also be given. For an
introduction to the algebraic theory of finite automata we refer to [??]. Let L
be a regular language of $A^+$ and let $F_L = (A, S, \delta, s_0, S')$ be its unique
minimal dfa. We define the syntactic semigroup of L via the transition
semigroup of $F_L$, i.e., as $S_L =_{def} \{\delta^w : w \in A^+\}$ where the composition is
defined as $\delta^v \cdot \delta^w =_{def} \delta^{vw}$. By $id_S$ we denote the identity mapping on S.
The syntactic semigroup $S_L$ can be considered as an ordered syntactic semigroup
$(S_L, \leq)$ with order relation $\leq$ by setting $\mu \leq \eta$ if and only if
$\alpha\eta\beta(s_0) \in S'$ implies $\alpha\mu\beta(s_0) \in S'$ for all $\alpha, \beta \in S_L \cup \{id_S\}$.
If $\mu$ is an element of a finite semigroup, the minimal idempotent power of $\mu$
is denoted as $\mu^\omega$.
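
For a small automaton, the transition semigroup and the idempotent power $\mu^\omega$ can be computed by brute force. The sketch below is an illustration under assumptions made here: mappings $\delta^w$ are stored as tuples indexed by state, composition follows the convention $\delta^v \cdot \delta^w = \delta^{vw}$ from the text, and all function names are hypothetical. It does not compute the order relation.

```python
from typing import Dict, Set, Tuple

Mapping = Tuple[int, ...]  # entry s holds the image of state s


def compose(f: Mapping, g: Mapping) -> Mapping:
    """Apply f first, then g, so compose(delta^v, delta^w) = delta^{vw}."""
    return tuple(g[f[s]] for s in range(len(f)))


def transition_semigroup(generators: Dict[str, Mapping]) -> Set[Mapping]:
    """Closure of the letter mappings under composition: {delta^w : w in A^+}."""
    elements = set(generators.values())
    frontier = list(elements)
    while frontier:
        f = frontier.pop()
        for g in generators.values():
            h = compose(f, g)  # extend the underlying word by one letter
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


def idempotent_power(f: Mapping) -> Mapping:
    """Return the idempotent among f, f^2, f^3, ... (it exists by finiteness)."""
    power = f
    while compose(power, power) != power:
        power = compose(power, f)
    return power


# Letter mappings of a three-state example dfa over {a, b}.
EXAMPLE_GENERATORS: Dict[str, Mapping] = {"a": (1, 2, 2), "b": (0, 0, 2)}
```

For the example generators the closure has five elements, and the idempotent power of the mapping for `a` is the constant mapping onto the sink state 2.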
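
Theorem 8. Let $L \subseteq A^+$ be a regular language and $(S_L, \leq)$ be its ordered
syntactic semigroup. Then $L \in \mathcal{B}_{3/2}$ if and only if $(S_L, \leq)$ satisfies all
inequalities $\{E_m : m \geq 0\}$ for any choice of $\tau_i$, $\beta_i$ and $\gamma_i$ from $S_L$, where

$E_m =_{def} \beta^\omega \gamma \beta^\omega \leq \beta^\omega$ with

$\gamma =_{def} \tau_0^\omega \gamma_1 \tau_1^\omega \gamma_2 \tau_2^\omega \cdots \gamma_{m+1} \tau_{m+1}^\omega$ and

$\beta =_{def} \tau_0^\omega \beta_0 \tau_0^\omega \gamma_1 \tau_1^\omega \beta_1 \tau_1^\omega \gamma_2 \tau_2^\omega \beta_2 \tau_2^\omega \cdots \gamma_{m+1} \tau_{m+1}^\omega \beta_{m+1} \tau_{m+1}^\omega$.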

Due to [??] the class $\mathcal{B}_{3/2}$ can be understood as a positive +-variety of
languages when varying the alphabet A (for the definition of the notion of
positive varieties we refer to [??]). An Eilenberg-like theorem was given for
the case of positive varieties in [??], which states that positive +-varieties
of languages and varieties of finite ordered semigroups are in one-to-one
correspondence via the operation of taking ordered syntactic semigroups. So the
inequalities $\{E_m : m \geq 0\}$ from Theorem 8 characterize the variety of finite
ordered semigroups corresponding to $\mathcal{B}_{3/2}$. It is known that this variety is
equal to the Mal'cev product of the variety of finite semigroups corresponding
to dot-depth one languages with a certain other variety of finite ordered
semigroups, as stated in Theorem 5.8 in [PW97]. The benefit of our
characterization is that it is effective, as follows from Theorem ??.

7 Conclusions

It was conjectured in [??] that the decidability questions for the
Straubing-Thérien hierarchy and the dot-depth hierarchy are related not only on
levels n for integers n [??], but on all levels n/2. We confirm the latter with
our work now also for n = 3, while the general case remains open.

We see the contribution of our paper not only as a stand-alone result providing
the decidability of $\mathcal{B}_{3/2}$, but also in what we can carry over to the general
case. First we note that the nature of our proof is such that it bounds in a
computable way (in terms of the automaton size and the alphabet size) the
descriptional complexity of a language L, i.e., it bounds the length of an
expression that witnesses that L is of dot-depth 3/2. Let us continue on an
informal level.

If we compare pattern P' from [PW97] characterizing $\mathcal{B}_{1/2}$ and our pattern P
characterizing $\mathcal{B}_{3/2}$, we observe that subgraphs of type P' appear as $u_iv_iu_i$
in pattern P. Now one can repeat inductively this formation procedure, using
pattern P in a more complicated pattern in the same way as P' appears in P.
This leads for $n \geq 1$ to the definition of patterns $R_n$ for which $R_1 = P'$ and
$R_2 = P$ holds. (This can also be done for the Straubing-Thérien hierarchy with
the same formation procedure, but starting with the pattern that characterizes
level 1/2 there.) Moreover, looking for the existence of $R_n$ can be effectively
carried out with a recursive application of our algorithm for testing
pattern P.

Denote by $\mathcal{C}_n$ the class of languages that can be accepted exactly by those
dfa's which do not have $R_n$ in their transition graph. We believe that it is a
reasonable conjecture that $\mathcal{B}_{n-1/2} = \mathcal{C}_n$ holds for all $n \geq 1$. As we know now,
this is true for n = 1 and n = 2. We can further support this by some partial
results, left without proofs here. First we note that for all $n \geq 1$ it holds
that $\mathcal{C}_n$ is a subclass of starfree languages. Moreover, one can show with a
generalization of Theorem 3 that $\mathcal{C}_n$ contains $\mathcal{B}_{n-1/2}$. Finally, we see that
$\mathcal{C}_n \subsetneq \mathcal{C}_{n+1}$ using the languages from [??] witnessing the strictness of the
dot-depth hierarchy.

Borchert, Klaus W. Wagner and Thomas Wilke.

Abstract. This paper concerns the uniform random generation and the
approximate counting of combinatorial structures admitting an ambiguous
description. We propose a general framework to study the complexity
of these problems and present some applications to specific classes of
languages. In particular, we give a uniform random generation algorithm for
finitely ambiguous context-free languages with the same time complexity as
the best known algorithm for the unambiguous case. Other applications
include a polynomial time uniform random generator and approximation
scheme for the census function of (i) languages recognized in polynomial
time by one-way nondeterministic auxiliary pushdown automata
of polynomial ambiguity and (ii) polynomially ambiguous rational trace
languages.

Keywords: uniform random generation, approximate counting, context-free
languages, auxiliary pushdown automata, rational trace languages,
inherent ambiguity.



1 Introduction 

In this work we propose a general framework to study the complexity of uniform
random generation and counting problems for combinatorial structures
represented through an ambiguous specification, in the sense that the same
object may admit several distinct descriptions. Structures of this kind occur
frequently in different contexts. Typical examples are inherently ambiguous
context-free languages, usually specified by generating context-free grammars;
formal languages accepted by nondeterministic machines, whose words can be
represented by the accepting computations; and combinatorial objects
ambiguously represented by words over a given alphabet, for instance traces in
free partially commutative monoids. What often happens in these cases is that
the counting problem for the structure (i.e., determining the number of objects
of given size) is a difficult problem, while computing the number of different
descriptions of these objects is easy.

As an example, computing the number of words of length n in inherently
ambiguous context-free languages is complete for $\#P_1$ (the restriction of #P
to functions with unary inputs) [??], while determining the number of
derivation trees of words of length n in context-free grammars is easily
solvable in polynomial time. It is worth noting that the last result holds even
when the grammar is part of the input [??], while the first problem is complete
also for inherently ambiguous context-free languages of ambiguity degree 2. As
a consequence of these negative results, in the case of ambiguous descriptions,
the usual approach to random generation through counting should be avoided
whenever we look for efficient algorithms.
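
To illustrate why counting descriptions is the easy side, the following sketch counts derivation trees by dynamic programming. It assumes a grammar in Chomsky normal form without the empty word; this input format, the function name, and the encoding of rules are assumptions made here for illustration and are not taken from the cited works.

```python
from typing import Dict, List, Tuple


def count_derivation_trees(terminal_rules: Dict[str, int],
                           binary_rules: Dict[str, List[Tuple[str, str]]],
                           start: str, n: int) -> int:
    """Number of derivation trees rooted at `start` with a yield of length n.

    terminal_rules[X] is the number of rules X -> a (a a terminal);
    binary_rules[X] lists the pairs (Y, Z) of the rules X -> Y Z.
    """
    if n < 1:
        return 0  # no empty word in a CNF grammar without epsilon rules
    nonterminals = set(terminal_rules) | set(binary_rules) | {start}
    for rules in binary_rules.values():
        for left, right in rules:
            nonterminals.update((left, right))
    # trees[X][m] = number of derivation trees from X with yield length m
    trees = {x: [0] * (n + 1) for x in nonterminals}
    for x in nonterminals:
        trees[x][1] = terminal_rules.get(x, 0)
    for length in range(2, n + 1):
        for x, rules in binary_rules.items():
            total = 0
            for left, right in rules:
                for split in range(1, length):
                    total += trees[left][split] * trees[right][length - split]
            trees[x][length] = total
    return trees[start][n]
```

For the ambiguous grammar S -> S S | a, the language contains exactly one word of each length, whereas the number of derivation trees follows the Catalan numbers: `count_derivation_trees({"S": 1}, {"S": [("S", "S")]}, "S", 4)` counts the five trees of the word aaaa.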

We recall that, in the case of unambiguous formal descriptions, the uniform
random generation and counting problems have been widely studied in the
literature (see, for instance, [??]). In particular, due to their well-known
applications, the uniform random generation of unambiguous context-free
languages has been considered in several papers [??]. This problem is
implicitly treated in [??] as a special case of a more general analysis of
algorithms for uniform random generation of combinatorial structures specified
by (unambiguous) formal grammars that involve operations of union, product,
construction of sets, sequences and cycles. Many classical combinatorial
objects can be specified in this way and the same analysis can be carried out
for unambiguous context-free languages. The best general routine, assuming a
fixed arbitrary grammar, generates an object of size n uniformly at random in
O(n log n) time. This bound is in terms of arithmetic complexity: each step of
the algorithm can require an arithmetic operation over O(n log n)-bit integers
or it can generate in constant time an integer of O(n) bits uniformly at
random.

By contrast, uniform random generation and counting problems for structures
represented by ambiguous formalisms have not received much attention in the
literature. An analysis of the complexity of uniform random generation for
combinatorial structures defined by polynomial time relations is given in [??].
The authors give some evidence that, under suitable hypotheses, (almost
uniform) random generation is easier than counting, but more difficult than
recognizing. Recently, following a similar approach, a subexponential time
algorithm was presented in [??] for the (almost uniform) random generation of
words in a (possibly ambiguous) context-free language.

In this paper we introduce a simple notion of description of a combinatorial
structure, together with a corresponding notion of ambiguity, and study the
problem of uniform random generation and approximate counting for structures
endowed with such descriptions. We prove a general result (Section ??) stating
that if a structure S has a description T with polynomially bounded ambiguity
and T admits a polynomial time uniform random generator (u.r.g.), then S also
admits a u.r.g. working in polynomial time. If, further, the counting problem
for T is solvable in polynomial time, then S admits a fully polynomial time
randomized approximation scheme (r.a.s.) for its counting problem. Here, the
proofs are based on the Karp-Luby technique for sampling from a union of
sets [??] and on Hoeffding's inequality, a classical tool for bounding the tail
probability of the sum of independent bounded random variables. Moreover, the
computation model we use is essentially a RAM, under the logarithmic cost
criterion, equipped in addition with an unbiased coin-tossing device.

These results can be applied to various classes of languages: 

(1) We show (Section 4) that, for finitely ambiguous context-free languages,
a word of length $n$ can be generated uniformly at random in $O(n^2 \log n)$
time and $O(n^2)$ space, using $O(n^2 \log n)$ random bits. We observe that, in
our model of computation, the same bounds for time and random bits are
obtained for the uniform random generation of unambiguous context-free
languages [?]. Analogous bounds are obtained for the corresponding
randomized approximation scheme. To prove these results we show in detail a
multiplicity version of Earley's algorithm for context-free recognition [?].
We prove that for finitely ambiguous context-free languages, the number of
derivation trees of an input word of size $n$ can be computed in $O(n^2 \log n)$
time and $O(n^2)$ space;

(2) We show (Section 5) how to generate, uniformly at random, words from
languages accepted by one-way nondeterministic auxiliary push-down automata
working in polynomial time and using a logarithmic amount of work-space [?].
Also in this case, we obtain polynomial time u.r.g. and r.a.s.
whenever the automaton has a polynomial number of accepting computations
for each input word;

(3) We consider (Section 6) the uniform random generation and approximate
counting of rational trace languages [?]. Finitely ambiguous rational trace
languages admit u.r.g. and r.a.s. of the same time complexity as the algorithms
for their recognition problem. Analogously, we obtain polynomial
time u.r.g. and r.a.s. for the rational trace languages that are polynomially
ambiguous.

2 Preliminary Definitions 

In this work, as model of computation, we assume a Probabilistic Random Access
Machine (PrRAM for short) according to which the complexity of a procedure
takes into account the number of random bits used by the computation; a similar
model is implicitly assumed in [23], where the complexity of random number
generation from arbitrary distributions is studied.

Formally, our machine is an augmented version of the standard RAM model
equipped in addition with a 1-way read-only random tape and an instruction
RND. The random tape contains a sequence $r = r_0, r_1, \ldots$ of symbols in $\{0, 1\}$
and, in the initial configuration, its head scans the first symbol. During the
computation, the instruction “rnd $i$” transfers the first $i$ unread bits from the
random tape into the accumulator and moves the tape head $i$ positions ahead. For a
fixed sequence $r$ on the random tape and a given input $x$, the output $M(x, r)$ of
the computation of a PrRAM $M$ is defined essentially as in the standard RAM
model; furthermore, by $M(x)$ we mean the random variable denoting the output
$M(x, r)$ assuming $r$ a sequence of independent random variables such that
$\Pr\{r_i = 1\} = \Pr\{r_i = 0\} = 1/2$ for every $i \ge 0$. Hence the instruction “rnd $i$”
generates an integer in $\{0, \ldots, 2^i - 1\}$ uniformly at random. To evaluate the
space and time complexity of a PrRAM computation we adopt the logarithmic
cost criterion defined in [?] for the standard RAM model assuming, in addition,
a time cost $i$ for every instruction “rnd $i$”.
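
The random tape and its instruction can be pictured with a small simulation. The following is only a sketch under stated assumptions: the class name `RandomTape` and the attribute `cost` are hypothetical, and Python's pseudo-random generator stands in for the ideal sequence of unbiased bits.

```python
import random


class RandomTape:
    """Hypothetical model of the one-way read-only random tape of a PrRAM."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)  # stand-in for the unbiased bit sequence r
        self.cost = 0                    # time charged so far for rnd instructions

    def rnd(self, i):
        """Read the next i unread bits; return an integer in {0, ..., 2**i - 1}."""
        value = 0
        for _ in range(i):
            value = (value << 1) | self._rng.getrandbits(1)
        self.cost += i  # the model charges time i for the instruction "rnd i"
        return value
```

Reading bits one at a time mirrors the tape head moving $i$ positions, and the counter makes the number of random bits consumed explicit, which is the quantity the complexity bounds below keep track of.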

Due to the restriction to unbiased coins, an algorithm in our model may fail
to give the correct answer and in this case it outputs a conventional symbol $\bot$.
However, good algorithms should reduce as much as possible the probability of
such an event.

We think this machine combines the advantages of the two main models considered
in connection with the random generation of combinatorial structures, i.e.
the “arithmetic” machine assumed in [?] and the probabilistic Turing machine
used in [?]. On the one hand, it is suited to the specification of algorithms at a high
level, allowing an easy analysis of time and space complexity. On the other,
it also allows a reasonably realistic analysis of the procedures, which
does not neglect the size of operands and tries to satisfy the principle
that every elementary step of the machine be implementable in constant time
by some fixed hardware.

2.1 Ambiguous Descriptions of Combinatorial Structures 

A combinatorial structure is a pair $(S, |\cdot|)$, where the domain $S$ is a finite
or denumerable set and the size $|\cdot| : S \to \mathbb{N}$ is a function such that
$\#\{s \in S : |s| = n\}$ is finite for every $n \in \mathbb{N}$.^1 Here, we implicitly assume the
elements $s \in S$ admit a (recursive) binary representation such that each $|s|$ is
polynomially related to the length of its binary representation; this allows our
model of computation to manipulate the elements of combinatorial structures.
Given such a structure $(S, |\cdot|)$, we also denote by $S_n$ the set $\{s \in S : |s| = n\} \subseteq S$
and define the census function $C_S : \mathbb{N} \to \mathbb{N}$ by $C_S(n) = \#S_n$. Then, we formally
introduce the concept of ambiguous description.

Definition 1. $(T, |\cdot|)$ is a description of $(S, |\cdot|)$ via the function $f : T \to S$
if $f$ is a surjective function preserving $|\cdot|$, i.e. $|f(t)| = |t|$ for every $t \in T$. The
ambiguity of the description is the function $d : S \to \mathbb{N}$ defined by
$d(s) = \#\{t \in T : f(t) = s\}$, for every $s \in S$. We say that the description is ambiguous if
$d(s) > 1$ for some $s \in S$. Moreover, the description is said to be polynomial
whenever $f$ and $d$ are computable in polynomial time and there exists some
$D \in \mathbb{N}$ such that $d(s) = O(|s|^D)$.

We now introduce the following notions in the spirit of [?].

Definition 2. An algorithm $A$ is a uniform random generator (u.r.g.) for a
combinatorial structure $(S, |\cdot|)$ if, for every $n > 0$ such that $C_S(n) > 0$,

(i) $A$ on input $n$ returns a value $A(n) \in S_n \cup \{\bot\}$,

(ii) $\Pr\{A(n) = s \mid A(n) \neq \bot\} = 1/C_S(n)$ for every $s \in S_n$ and

(iii) $\Pr\{A(n) = \bot\} < 1/4$.

^1 In this paper, to avoid confusion with $|\cdot|$, $\#A$ denotes the cardinality of the set $A$.



Definition 3. An algorithm $A$ is a randomized approximation scheme (r.a.s.)
for the census function $C_S$ of a combinatorial structure $(S, |\cdot|)$ if for every
$n > 0$ such that $C_S(n) > 0$ and every $\varepsilon \in (0, 1)$,

(i) $A$ on input $n, \varepsilon$ returns a value $A(n, \varepsilon) \in \mathbb{Q} \cup \{\bot\}$,

(ii) $\Pr\{(1 - \varepsilon)C_S(n) \le A(n, \varepsilon) \le (1 + \varepsilon)C_S(n) \mid A(n, \varepsilon) \neq \bot\} > 3/4$ and

(iii) $\Pr\{A(n, \varepsilon) = \bot\} < 1/4$.

Moreover, an r.a.s. is said to be a fully polynomial time r.a.s. whenever it works
in time polynomial in $n$ and $1/\varepsilon$.



Definition 4. An algorithm $A$ is a randomized exact counter (r.e.c.) for a
combinatorial structure $(S, |\cdot|)$ if for every $n > 0$ such that $C_S(n) > 0$,

(i) $A$ on input $n$ returns a value $A(n) \in \mathbb{N} \cup \{\bot\}$,

(ii) $\Pr\{A(n) = C_S(n) \mid A(n) \neq \bot\} > 3/4$ and

(iii) $\Pr\{A(n) = \bot\} < 1/4$.

Observe that the constant 1/4 in the previous definitions can be replaced by 
every positive number strictly less than 1, leaving the same notions substantially 
unchanged. Similarly, in the last two definitions, by taking the median of the 
outputs of several runs of the algorithm the constant 3/4 can be replaced 
by every number strictly between 1/2 and 1. 

3 Uniform Random Generation and Approximate 
Counting 

We start by considering the uniform random generation problem. 

Theorem 1. If a combinatorial structure $(S, |\cdot|)$ admits a polynomial (ambiguous)
description $(T, |\cdot|)$ and there exists a polynomial time u.r.g. for $(T, |\cdot|)$,
then there exists a polynomial time u.r.g. for $(S, |\cdot|)$.

To prove this theorem, we show a stronger result derived by applying the
Karp-Luby technique for sampling from a union of sets [?].

Lemma 1. Let $(S, |\cdot|)$ admit a polynomial (ambiguous) description $(T, |\cdot|)$
via $f$ and let $T_f(n)$ and $T_d(n)$ be respectively the computation time of $f$ and its
ambiguity $d$; moreover, assume $d(s) = O(|s|^D)$ for some $D \in \mathbb{N}$. If $B$ is a u.r.g.
for $(T, |\cdot|)$ working in $T_B(n)$ time and using $R_B(n)$ random bits, then there exists
a u.r.g. for $(S, |\cdot|)$ working in time $O(n^{2D} + n^D(T_B(n) + T_f(n) + T_d(n)))$ and
using $O(n^{2D} + n^D R_B(n))$ random bits.

Proof. Define, for the sake of brevity, $D(n) = \kappa_1 n^D + \kappa_2$, where $\kappa_1, \kappa_2$ are two
integer constants such that $d(s) \le D(|s|)$; let $A$ be the following algorithm, where
$\kappa > 0$ is a suitable integer constant discussed later and $\mathrm{lcm}\, I$ denotes the least
common multiple of a set $I \subseteq \mathbb{N}$.

input n
m ← lcm{1, ..., D(n)}, ℓ ← ⌈log m⌉
i ← 0, s ← ⊥
while i < κD(n) and s = ⊥ do
    i ← i + 1
    t ← B(n)
    if t ≠ ⊥ then
        s ← f(t)
        generate r ∈ {1, ..., 2^ℓ} uniformly at random
        if r > m/d(s) then s ← ⊥
output s
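
The procedure above can be sketched in Python as follows. This is an illustrative rendering, not the authors' implementation: the names `make_urg`, `BOTTOM` and the toy description at the bottom are assumptions, the default `kappa=4` is one admissible choice of the constant, and Python's pseudo-random generator replaces the random tape.

```python
import math
import random
from functools import reduce

BOTTOM = None  # plays the role of the failure symbol


def make_urg(B, f, d, D, kappa=4, rng=random):
    """Build a generator A(n) for S from a generator B(n) for the description T."""
    def A(n):
        bound = D(n)
        # m = lcm{1, ..., D(n)}, so every possible ambiguity d(s) divides m
        m = reduce(lambda a, b: a * b // math.gcd(a, b), range(1, bound + 1), 1)
        ell = (m - 1).bit_length()           # ceil(log2 m) for m >= 1
        for _ in range(kappa * bound):
            t = B(n)
            if t is BOTTOM:
                continue
            s = f(t)
            r = rng.randrange(2 ** ell) + 1  # uniform in {1, ..., 2**ell}
            if r <= m // d(s):               # accept with probability m / (d(s) 2**ell)
                return s
        return BOTTOM
    return A


# Toy instance (an assumption for illustration): S_n is the set of words over
# {a, b} of length n >= 1 that start or end with 'a'; a description is a pair
# (side, word) recording which end is known to be 'a'.
def B(n):
    side = random.randrange(2)
    rest = "".join(random.choice("ab") for _ in range(n - 1))
    return (side, "a" + rest) if side == 0 else (side, rest + "a")


f = lambda t: t[1]                            # forget which end was recorded
d = lambda w: (w[0] == "a") + (w[-1] == "a")  # number of descriptions of w
D = lambda n: 2                               # bound on the ambiguity

A = make_urg(B, f, d, D)
```

In the toy instance a word with 'a' at both ends has two descriptions, so it is drawn twice as often by `B`; the acceptance test halves its probability and restores uniformity over the words.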

First of all, we focus on the computation time of $A$ on input $n$. One can show
that the time to precompute $D(n)$, $m$ and $\ell$ is $O(n^{2D})$. The time of each iteration
equals $O(T_B(n) + T_f(n) + T_d(n))$ plus the time $O(n^D)$ required for the generation
of $r$. Moreover, each iteration uses at most $R_B(n)$ and $O(n^D)$ random bits to
compute $t$ and $r$ respectively; hence, over the at most $\kappa D(n) = O(n^D)$ iterations,
the total amount of random bits used by $A$ adds up to $O(n^{2D} + n^D R_B(n))$.

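Now, we prove the correctness of $A$. Assume that $C_S(n) > 0$ so that, since $f$
is surjective, $C_T(n) > 0$, and let $\mathcal{S}$ and $\mathcal{T}$ be the random variables representing
respectively the value of $s$ and $t$ at the end of a while iteration (observe that $\mathcal{S}$
and $\mathcal{T}$ are well defined since their value is independent of the outcomes of the
preceding iterations). Then, $\mathcal{S}$ takes value $s \in S_n$ whenever $\mathcal{T} \in f^{-1}(s) \subseteq T_n$ and
$r \le m/d(f(\mathcal{T}))$. Moreover, by definition of $B$, there exists some $0 \le \delta < 1/4$ such
that $\Pr\{\mathcal{T} = t\} = (1 - \delta)C_T(n)^{-1}$ for every $t \in T_n$ and, since $r$ is independent of
$\mathcal{T}$, $\Pr\{r \le m/d(f(\mathcal{T})) \mid \mathcal{T} = t\} = d(f(t))^{-1} m 2^{-\lceil \log m \rceil}$. Hence, for every
$s \in S_n$,

$$\begin{aligned}
\Pr\{\mathcal{S} = s\} &= \sum_{t \in f^{-1}(s)} \Pr\{\mathcal{T} = t,\ r \le m/d(f(\mathcal{T}))\} \\
&= \sum_{t \in f^{-1}(s)} \Pr\{r \le m/d(f(\mathcal{T})) \mid \mathcal{T} = t\}\,\Pr\{\mathcal{T} = t\} \\
&= d(s)\left(d(s)^{-1} m 2^{-\lceil \log m \rceil} (1 - \delta) C_T(n)^{-1}\right) \\
&= (1 - \delta)\, m 2^{-\lceil \log m \rceil} C_T(n)^{-1},
\end{aligned}$$

which is independent of $s$. On the other hand, at the end of each while iteration,
$\Pr\{\mathcal{S} = \bot\} = 1 - \Pr\{\mathcal{S} \in S_n\} = 1 - C_S(n)\left((1 - \delta)\, m 2^{-\lceil \log m \rceil} C_T(n)^{-1}\right)$ and,
since $\delta < 1/4$, $m 2^{-\lceil \log m \rceil} > 1/2$ and $C_T(n) \le C_S(n)D(n)$,

$$\Pr\{\mathcal{S} = \bot\} \le 1 - \frac{3}{8}\,\frac{1}{D(n)}.$$

Finally, since $\Pr\{A(n) = \bot\} = \Pr\{\mathcal{S} = \bot\}^{\kappa D(n)}$, an integer $\kappa > 0$ can be
chosen such that, for every $n > 0$,

$$\Pr\{A(n) = \bot\} \le \left(1 - \frac{3}{8D(n)}\right)^{\kappa D(n)} < 1/4.$$

Moreover, for every $s \in S_n$, it holds that $\Pr\{A(n) = s \mid A(n) \neq \bot\} = \Pr\{\mathcal{S} = s \mid
\mathcal{S} \neq \bot\}$ and, since $\Pr\{\mathcal{S} = s\}$ is constant, independent of $s$, it is immediate to
conclude that $\Pr\{A(n) = s \mid A(n) \neq \bot\} = 1/C_S(n)$. □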
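
The choice of the constant $\kappa$ in the proof is not spelled out in the text; one standard way to see that a fixed value suffices is the elementary inequality $1 - x \le e^{-x}$, as in the following worked estimate.

```latex
\left(1 - \frac{3}{8D(n)}\right)^{\kappa D(n)}
  \;\le\; \exp\!\left(-\frac{3}{8D(n)} \cdot \kappa D(n)\right)
  \;=\; e^{-3\kappa/8}
  \;<\; \frac{1}{4}
  \qquad\text{whenever } \kappa > \frac{8 \ln 4}{3} \approx 3.70 .
```

Under this estimate, $\kappa = 4$ already works for every $n$, independently of $D(n)$.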

Using Hoeffding's inequality [?], a similar result can be proved for the
approximate and exact counting problem.

Theorem 2. Let $(S, |\cdot|)$ be a combinatorial structure admitting a polynomial
(ambiguous) description $(T, |\cdot|)$ and assume there exists a polynomial time u.r.g.
for $(T, |\cdot|)$. If $C_T(n)$ is computable in polynomial time, then there exists a fully
polynomial time r.a.s. for $C_S$. If further $C_S(n)$ is polynomially bounded, then
there exists a polynomial time r.e.c. for $(S, |\cdot|)$.
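
The text omits the proof of Theorem 2. One plausible route, offered here only as a sketch and not as the authors' argument, uses the identity $C_S(n) = \sum_{t \in T_n} 1/d(f(t)) = C_T(n) \cdot E[1/d(f(\mathcal{T}))]$ for $\mathcal{T}$ uniform on $T_n$, and estimates the expectation by sampling; Hoeffding's inequality then bounds the number of samples needed. The function name `approx_census`, the sample-size formula and the toy description are assumptions.

```python
import math
import random


def approx_census(n, eps, B, f, d, D, count_T):
    """Estimate C_S(n) within relative error eps with probability at least 3/4."""
    bound = D(n)
    # X = 1/d(f(t)) lies in [1/bound, 1], so its mean is at least 1/bound and
    # Hoeffding gives failure probability <= 2 exp(-2 N eps^2 / bound^2) <= 1/4.
    samples = math.ceil(bound ** 2 * math.log(8) / (2 * eps ** 2))
    total, drawn = 0.0, 0
    while drawn < samples:
        t = B(n)
        if t is None:  # B failed; draw again
            continue
        total += 1.0 / d(f(t))
        drawn += 1
    return count_T(n) * total / samples


# Toy description (an assumption): words over {a, b} of length n >= 2 that
# start or end with 'a', described by pairs (side, word).
def B(n):
    side = random.randrange(2)
    rest = "".join(random.choice("ab") for _ in range(n - 1))
    return (side, "a" + rest) if side == 0 else (side, rest + "a")


f = lambda t: t[1]
d = lambda w: (w[0] == "a") + (w[-1] == "a")
D = lambda n: 2
count_T = lambda n: 2 ** n  # two families of 2**(n-1) descriptions each
```

For the toy description the exact census is $3 \cdot 2^{n-2}$, so for $n = 6$ the estimate should be close to 48.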

4 Context-Free Languages 

Applying our general paradigm to context-free (c.f. for short) grammars, we
can design simple u.r.g. and r.a.s. of census functions for inherently ambiguous
c.f. languages. To this end, let $G = (V, \Sigma, S, P)$ be a c.f. grammar, where $V$
is the set of nonterminal symbols (which we also call variables), $\Sigma$ the alphabet of
terminals, $S \in V$ the initial variable and $P$ the family of productions. We assume
$G$ in Chomsky normal form [?] without useless variables (clearly, every c.f.
language not containing the empty word $\varepsilon$ can be generated by such a grammar).
Moreover, for every $A \in V$, let $T_A$ be the family of derivation trees with root
labelled by $A$ deriving a word in $\Sigma^+$. It is easy to see that there are finitely
many $t \in T_S$ deriving a given $x \in \Sigma^+$; in the following, we denote by $d_G(x)$
the number of such trees and call ambiguity of $G$ the function $d_G : \mathbb{N} \to \mathbb{N}$
defined by $d_G(n) = \max\{d_G(x) : x \in \Sigma^n\}$, for every $n \in \mathbb{N}$. Then, $G$ is said to be
finitely ambiguous if there exists a $k \in \mathbb{N}$ such that $d_G(n) \le k$ for every $n > 0$;
in particular, $G$ is said to be unambiguous if $k = 1$. On the other hand, $G$ is said to be
polynomially ambiguous if, for some polynomial $p(n)$, we have $d_G(n) \le p(n)$ for
every $n > 0$.

Our idea is to use the structure $(T_S, |\cdot|)$ as ambiguous description of the
language $L$ generated by $G$, where, for every $t \in T_S$, $|t|$ is the length of the derived
word. To this aim, we need two preliminary procedures: one for generating a tree
of given size in $T_S$ uniformly at random, the other for computing the degree of
ambiguity $d_G(x)$ for any word $x \in \Sigma^+$. The first problem can be solved by
adapting the procedure given in [?], based on the general approach to
random generation proposed in [?].

Proposition 1. Given a context-free grammar $G = (V, \Sigma, S, P)$ in Chomsky
normal form, there exists a u.r.g. for $(T_S, |\cdot|)$ working in $O(n^2 \log n)$ time, $O(n)$
space and using $O(n^2 \log n)$ random bits.
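
To make the statement concrete, the following sketch generates derivation trees uniformly at random by the textbook counting-based recursive method. It is an assumption-laden illustration, not the procedure the proposition refers to: it stores a quadratic table of counts and therefore does not meet the linear space bound. The grammar encoding (`unary`, `binary`) and all function names are hypothetical.

```python
import random
from functools import lru_cache


def make_tree_sampler(unary, binary, rng=random):
    """unary[A] lists terminals a with A -> a; binary[A] lists pairs (B, C) with A -> BC."""
    @lru_cache(maxsize=None)
    def count(A, k):
        """Number of derivation trees rooted at A whose yield has length k."""
        if k == 1:
            return len(unary.get(A, []))
        return sum(count(B, j) * count(C, k - j)
                   for (B, C) in binary.get(A, [])
                   for j in range(1, k))

    def sample(A, k):
        """Return a uniformly random tree rooted at A of size k, or None if none exists."""
        total = count(A, k)
        if total == 0:
            return None
        if k == 1:
            return (A, rng.choice(unary[A]))
        r = rng.randrange(total)
        # Pick a production and a split point with probability proportional
        # to the number of trees they account for.
        for (B, C) in binary.get(A, []):
            for j in range(1, k):
                w = count(B, j) * count(C, k - j)
                if r < w:
                    return (A, sample(B, j), sample(C, k - j))
                r -= w

    return count, sample


def yield_of(tree):
    """Word derived by a tree; leaves are pairs (A, a), inner nodes triples (A, left, right)."""
    if len(tree) == 2:
        return tree[1]
    return yield_of(tree[1]) + yield_of(tree[2])
```

For the ambiguous grammar with productions S -> SS and S -> a, `count("S", n)` is the number of binary trees with n leaves, so a sampled tree of size 4 is one of five equally likely trees, all deriving the same word.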

4.1 Earley's Algorithm for Counting Derivations

The number of derivation trees of a terminal string in a c.f. grammar can be
computed by adapting Earley's algorithm [?] for context-free recognition. The main
advantage of this procedure, with respect to the well-known CYK algorithm [?],
is that in the case of a grammar with bounded ambiguity, the computation only
requires quadratic time on a RAM under unit cost criterion [?].

Our algorithm manipulates a weighted version of the so-called dotted productions
of a grammar $G = (V, \Sigma, S, P)$ in Chomsky normal form, i.e. expressions
of the form $A \to \alpha \cdot \beta$, where $A \in V$, $\alpha, \beta \in (\Sigma \cup V)^*$ and $A \to \alpha\beta \in P$. Given
an input string $x = a_1 a_2 \ldots a_n$, the algorithm computes a table of entries $S_{i,j}$,
for $0 \le i \le j \le n$, each of which is a list of terms of the form $[A \to \alpha \cdot \beta, t]$,
where $A \to \alpha \cdot \beta$ is a dotted production in $G$ and $t$ is a positive integer. Each pair
$[A \to \alpha \cdot \beta, t]$ is called a state and $t$ is the weight of the state.

The table of lists $S_{i,j}$ computed by the algorithm has the following properties
for any pair of indices $0 \le i \le j \le n$:

1) $S_{i,j}$ contains at most one state $[A \to \alpha \cdot \beta, t]$ for every dotted production
$A \to \alpha \cdot \beta$ in $G$;

2) a state $[A \to \alpha \cdot \beta, t]$ belongs to $S_{i,j}$ if and only if there exists $\delta \in V^*$ such that
$S \Rightarrow^* a_1 \ldots a_i A \delta$ and $\alpha \Rightarrow^* a_{i+1} \ldots a_j$;

3) if $[A \to \alpha \cdot \beta, t]$ belongs to $S_{i,j}$, then $t = \#\{\alpha \Rightarrow^* a_{i+1} \ldots a_j\}$, i.e. the number
of leftmost derivations $\alpha \Rightarrow^* a_{i+1} \ldots a_j$.

Note that, since there are no $\varepsilon$-productions, $[A \to \alpha \cdot \beta, t] \in S_{i,i}$ implies $\alpha = \varepsilon$
for every $0 \le i \le n$. Furthermore, once the lists $S_{i,j}$ are completed for any
$0 \le i \le j \le n$, the number of derivation trees of the word $x$ can be obtained by
the sum $\sum_{[S \to AB\cdot,\, t] \in S_{0,n}} t$.

The algorithm first computes the list $S_{0,0}$ of all the states $[A \to \cdot\alpha, 1]$ such
that $S \Rightarrow^* A\delta$ for some $\delta \in V^*$. Then, it executes the cycle of Scanner, Predictor
and Completer loops given below for $1 \le j \le n$, computing at the $j$-th loop the
lists $S_{i,j}$ for $0 \le i \le j$. To this end the procedure maintains a family of sets
$L_{B,i}$ for $B \in V$ and $1 \le i \le j$; each $L_{B,i}$ contains all indices $k \le i$ such that a
state of the form $[A \to \alpha \cdot B\beta, t]$ belongs to $S_{k,i}$ for some $A \in V$, $\alpha, \beta \in V \cup \{\varepsilon\}$,
$t \in \mathbb{N}$. Moreover, during the computation every state in $S_{i,j}$ can be unmarked
or marked according to whether it can still be used to add new states to the table.

The command “ADD $D$ TO $S_{i,j}$” simply appends the state $D$ as unmarked to
$S_{i,j}$ and updates $L_{B,j}$ whenever $D$ is of the form $[A \to \alpha \cdot B\beta, t]$; the command
“UPDATE $[A \to \alpha \cdot \beta, t]$ IN $S_{i,j}$” replaces the state $[A \to \alpha \cdot \beta, u]$ in $S_{i,j}$, for some
$u \in \mathbb{N}$, by $[A \to \alpha \cdot \beta, t]$; finally, “MARK $D$ IN $S_{i,j}$” transforms the state $D$ in $S_{i,j}$
into a marked state.

 1  for j = 1 ... n do
    Scanner:
 2      for i = j−1 ... 0 do
 3          for [A → ·a, t] ∈ S_{i,j−1} do
 4              MARK [A → ·a, t] IN S_{i,j−1}
 5              if a = a_j then ADD [A → a·, t] TO S_{i,j}
    Completer:
 6      for i = j−1 ... 0 do
 7          for [B → γ·, t] ∈ S_{i,j} do
 8              MARK [B → γ·, t] IN S_{i,j}
 9              for k ∈ L_{B,i} do
10                  for [A → α·Bβ, u] ∈ S_{k,i} do
11                      if [A → αB·β, v] ∈ S_{k,j}
12                          then UPDATE [A → αB·β, v + tu] IN S_{k,j}
13                          else ADD [A → αB·β, tu] TO S_{k,j}
    Predictor:
14      for i = 0 ... j−1 do
15          for [A → α·Bβ, t] ∈ S_{i,j} do
16              MARK [A → α·Bβ, t] IN S_{i,j}
17              for B → γ ∈ P do
18                  if [B → ·γ, 1] ∉ S_{j,j} then ADD [B → ·γ, 1] TO S_{j,j}
19      while ∃ unmarked [A → ·Bβ, t] ∈ S_{j,j} do
20          MARK [A → ·Bβ, t] IN S_{j,j}
21          for B → γ ∈ P do
22              if [B → ·γ, 1] ∉ S_{j,j} then ADD [B → ·γ, 1] TO S_{j,j}

It can be shown that, at the end of the computation, properties 1) and 2)
above are satisfied (the proof carries over as for the membership problem [?,
Section 4.2.2]). Now, let us prove property 3). First note that all states in $S_{i,j}$,
for $1 \le i \le j \le n$, are marked during the computation. Hence, we can reason
by induction on the order of marking states. The initial condition is satisfied
because all states in each $S_{j,j}$, $0 \le j \le n$, are of the form $[A \to \cdot\alpha, t]$ and have
weight $t = 1$. Also the states of the form $[A \to a\cdot, t]$ have weight $t = 1$ and again
statement 3) is satisfied.

Then, consider a state $D = [A \to \alpha B \cdot \beta, w] \in S_{k,j}$ ($k < j$). We claim that $w$
is the number of leftmost derivations $\alpha B \Rightarrow^* a_{k+1} \ldots a_j$. A state of this form is
first added by the Completer at line 13. This means that there exists a set of
indices $I_k$ such that for every $i \in I_k$ there is $[A \to \alpha \cdot B\beta, u_i] \in S_{k,i}$ with $u_i \in \mathbb{N}$,
and a family $U_i$ of states $[B \to \gamma\cdot, t] \in S_{i,j}$ such that

$$w = \sum_{i \in I_k} \ \sum_{[B \to \gamma\cdot,\, t] \in U_i} t\, u_i.$$

Observe that $k \le i < j$ for every $i \in I_k$ and $U_i$ is the subset of all states in
$S_{i,j}$ with a dotted production of the form $B \to \gamma\cdot$. Moreover, each state
$[A \to \alpha \cdot B\beta, u_i] \in S_{k,i}$ is marked at line 16 or 20 before $D$ is added to $S_{k,j}$. Also all
states in $U_i$, for all $i \in I_k$, are marked during the computation of the weight $w$.
Observe that, due to the form of the grammar, updating such weight $w$ cannot
modify the weight of any state in $U_i$. As a consequence all the states in $U_i$ are
marked before $D$. Hence, by inductive hypothesis, we have for every $i \in I_k$

$$u_i = \#\{\alpha \Rightarrow^* a_{k+1} \ldots a_i\} \qquad (1)$$

and, for each $[B \to \gamma\cdot, t] \in U_i$,

$$t = \#\{\gamma \Rightarrow^* a_{i+1} \ldots a_j\}. \qquad (2)$$

Now the number of leftmost derivations $\alpha B \Rightarrow^* a_{k+1} \ldots a_j$ is clearly given by

$$\sum_{k \le i < j} \#\{\alpha \Rightarrow^* a_{k+1} \ldots a_i\} \sum_{B \to \gamma \in P} \#\{\gamma \Rightarrow^* a_{i+1} \ldots a_j\}$$

and the claim follows from statement 2) and equalities (1) and (2).

Finally, studying the behaviour of the algorithm, one obtains the following

Proposition 2. Given a context-free grammar $G$ in Chomsky normal form, the
number of derivation trees of a string of length $n$ can be computed in polynomial
time. If the grammar $G$ is finitely ambiguous, then the algorithm works in
$O(n^2 \log n)$ time and $O(n^2)$ space (under logarithmic cost criterion).
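
For readers who want to experiment with the quantity $d_G(x)$, the sketch below counts derivation trees with a CYK-style table. It deliberately swaps in the simpler cubic-time technique for the weighted Earley algorithm described above, so it illustrates what is computed rather than the quadratic bound. The grammar encoding and the function name are hypothetical.

```python
from collections import defaultdict


def count_derivation_trees(word, unary, binary, start):
    """Number of derivation trees of `word` in a Chomsky normal form grammar.

    unary[A] lists terminals a with A -> a; binary[A] lists pairs (B, C) with A -> BC.
    """
    n = len(word)
    if n == 0:
        return 0  # a Chomsky normal form grammar derives no empty word
    trees = defaultdict(int)  # trees[i, j, A] = number of trees A =>* word[i:j]
    for i, ch in enumerate(word):
        for A, terminals in unary.items():
            trees[i, i + 1, A] = terminals.count(ch)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for A, pairs in binary.items():
                for (B, C) in pairs:
                    for k in range(i + 1, j):
                        # multiply the weights of the two subtrees, as in the
                        # Completer step, and accumulate over split points
                        trees[i, j, A] += trees[i, k, B] * trees[k, j, C]
    return trees[0, n, start]
```

With productions S -> SS and S -> a, the word aaaa has five derivation trees, one for each binary bracketing of four leaves.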



4.2 Inherently Ambiguous Context-Free Languages

Now, let $L \subseteq \Sigma^*$ be a c.f. language. We recall that $L$ is unambiguous if it is
generated by an unambiguous c.f. grammar, while it is inherently ambiguous
whenever every c.f. grammar $G$ generating $L$ is ambiguous. We also say that $L$
is finitely (respectively, polynomially) ambiguous if it is generated by a finitely
(respectively, polynomially) ambiguous c.f. grammar.

Observe that there are natural examples of polynomially ambiguous c.f.
languages which are not finitely ambiguous. For instance, if $L = \{ww^R : w \in
\{a, b\}^*\}$, then $L^2$ is inherently ambiguous, but not finitely ambiguous [?, Section 7.3];
however, it is easy to verify that $L^2$ is generated by a grammar of
ambiguity $O(n)$.

Hence, applying Propositions 1 and 2 to Theorems 1 and 2 and Lemma 1,
we obtain

Theorem 3. If $L$ is a polynomially ambiguous context-free language, then there
exists a polynomial time u.r.g. for $L$ and a fully polynomial time r.a.s. for its
census function $C_L$. If further $C_L$ is polynomially bounded, then there exists a
polynomial time r.e.c. for $L$. Moreover, if $L$ is finitely ambiguous, then both the
u.r.g. and the r.a.s. work in $O(n^2 \log n)$ time (on a PrRAM under logarithmic
cost criterion).

5 One-Way Nondeterministic Auxiliary Pushdown Automata



In this section we apply our scheme to languages accepted by one-way
nondeterministic auxiliary pushdown automata (1-NAuxPDA, for short). We recall that
a 1-NAuxPDA is a nondeterministic Turing machine having a one-way read-only
input tape, a pushdown tape and a log-space bounded two-way read-write
work tape [?]. It is known that the class of languages accepted by polynomial
time 1-NAuxPDA corresponds exactly to the class of decision problems that are
reducible to context-free recognition via one-way log-space reductions [?].
Given a 1-NAuxPDA $M$, we denote by $d_M(x)$ the number of accepting computations
of $M$ on input $x \in \Sigma^*$, and call ambiguity of $M$ the function $d_M : \mathbb{N} \to \mathbb{N}$
defined by $d_M(n) = \max\{d_M(x) : x \in \Sigma^n\}$, for every $n \in \mathbb{N}$. Then, $M$ is said to be
polynomially ambiguous if, for some polynomial $p(n)$, we have $d_M(n) \le p(n)$ for
every $n > 0$. Moreover, it is known that, given an integer input $n > 0$, a c.f. grammar
$G_n$ can be built, in time polynomial in $n$, such that $L(G_n) \cap \Sigma^n = L(M) \cap \Sigma^n$,
where $L(G_n) \subseteq \Sigma^*$ is the language generated by $G_n$ and $L(M) \subseteq \Sigma^*$ is the
language accepted by $M$. This allows us to apply the results of the previous section
to the languages accepted by 1-NAuxPDA.

Here, we describe a modified version of the usual construction of $G_n$ [?] which
allows us to bound the ambiguity of $G_n$ in terms of the ambiguity of $M$. First
of all, we assume w.l.o.g. that the automaton cannot simultaneously consume
input and modify the content of the stack, that at most one symbol can be pushed or
popped in each single move, that there is only one final state and, finally, that the input is
accepted iff the automaton reaches the final state with the pushdown store and
work tape both empty.

A surface configuration of a 1-NAuxPDA $M$ on input of length $n$ is a
5-tuple $(q, w, i, \Gamma, j)$ where $q$ is the state of $M$, $w$ the content of its work tape,
$1 \le i \le |w|$ the work tape head position, $\Gamma$ the symbol on top of the stack and
$1 \le j \le n + 1$ the input tape head position. Observe that there are $n^{O(1)}$ surface
configurations on any input of length $n > 0$. Two surface configurations $C_1, C_2$
form a realizable pair $(C_1, C_2)$ (on a word $y \in \Sigma^*$) iff $M$ can move (consuming
input $y$) from $C_1$ to $C_2$, ending with its stack at the same height as in $C_1$,
without popping below this height at any step of the computation. If $(C_1, D)$
and $(D, C_2)$ are realizable pairs on $y'$ and $y''$ respectively, then $(C_1, C_2)$ is
a realizable pair on $y = y'y''$. Let $\mathcal{S}_n$ be the set of surface configurations of
$M$ on inputs of length $n$ and define the c.f. grammar $G_n(M) = (N, \Sigma, S, P)$
where $N = \{S\} \cup \{(C_1, C_2, \ell) : C_1, C_2 \in \mathcal{S}_n \text{ and } \ell \in \{0, 1\}\}$ and the set $P$ of
productions is given by:

(1) $S \to (C_{in}, C_{fin}, 0)$ and $S \to (C_{in}, C_{fin}, 1)$, where $C_{in}$ and $C_{fin}$ represent
respectively the initial and final surface configuration of $M$;

(2) $(C_1, C_2, 0) \to \sigma \in P$ iff $(C_1, C_2)$ is a realizable pair on $\sigma \in \Sigma \cup \{\varepsilon\}$ via a single
move computation;

(3) $(C_1, C_2, 0) \to (C_1, D, 1)(D, C_2, \ell) \in P$, for $\ell \in \{0, 1\}$, iff $C_1, C_2, D \in \mathcal{S}_n$;

(4) $(C_1, C_2, 1) \to (D_1, D_2, \ell) \in P$, for $\ell \in \{0, 1\}$, iff $C_1, D_1, D_2, C_2 \in \mathcal{S}_n$, $D_1$ can
be reached from $C_1$ in a single move pushing a symbol $a$ on top of the stack
and $C_2$ can be reached from $D_2$ in a single move popping the same symbol
from the top of the stack.

One can prove that the grammar $G_n(M)$ is computable in polynomial time
on input $n > 0$ and that the derivation $(C_1, C_2, \ell) \Rightarrow^* y$ holds in $G_n(M)$ for
some $\ell \in \{0, 1\}$ iff $(C_1, C_2)$ is a realizable pair on $y$ in $M$; here, the derivations
with $\ell = 0$ (respectively, $\ell = 1$) correspond to those computations where the
stack height at some intermediate step equals (respectively, never equals) the stack height at the
extremes $C_1, C_2$. This correspondence allows us to bound the ambiguity of $G_n(M)$,
according to the following proposition, the proof of which is omitted.

Proposition 3. For every polynomial time 1-NAuxPDA $M$, the number of leftmost
derivations $(C_1, C_2, \ell) \Rightarrow^* y$ in $G_n(M)$ for $\ell \in \{0, 1\}$ is less than or equal
to the total number of computations from $C_1$ to $C_2$ consuming $y$.

Now, observe that Propositions 1 and 2 can be rephrased assuming the c.f.
grammar $G$ as part of the input, still obtaining polynomial time algorithms.
Hence, by Theorems 1 and 2, Proposition 3 leads to the following

Theorem 4. Let $L$ be the language accepted by a polynomial time 1-NAuxPDA
with polynomial ambiguity. Then there exists a polynomial time u.r.g. for $L$
and a fully polynomial time r.a.s. for its census function $C_L$. If further $C_L$ is
polynomially bounded, then there exists a polynomial time r.e.c. for $L$.



6 Rational Trace Languages 

Another application of our method concerns the uniform random generation
and the census function of trace languages. To study this case we refer to [?]
for basic definitions. We only recall that, given a trace monoid $M(\Sigma, I)$ over
the independence alphabet $(\Sigma, I)$, a trace language, i.e. a subset of $M(\Sigma, I)$, is
usually specified by considering a string language $L \subseteq \Sigma^*$ and taking the closure
$[L] = \{t \in M(\Sigma, I) : t = [x] \text{ for some } x \in L\}$. In particular, a trace language
$T \subseteq M(\Sigma, I)$ is called rational if $T = [L]$ for some regular language $L \subseteq \Sigma^*$. In this
case we say that $T$ is represented by $L$ and the ambiguity of this representation
is the function $d_L : \mathbb{N} \to \mathbb{N}$, defined by $d_L(n) = \max_{x \in \Sigma^n} \#(L \cap [x])$. We say
that a rational trace language $T$ is finitely ambiguous if it is represented by a
regular language $L$ such that, for some $k \in \mathbb{N}$, $d_L(n) \le k$ for every $n > 0$; in
this case we say that $T$ has ambiguity $k$.

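In the following, assuming a given independence alphabet $(\Sigma, I)$, we denote
by $\mathcal{R}$ the set of all rational trace languages in $M(\Sigma, I)$ and by $\mathcal{R}_k$ the subset of
trace languages in $\mathcal{R}$ of ambiguity $k$. Clearly, for every independence alphabet
$(\Sigma, I)$, we have $\mathcal{R}_1 \subseteq \mathcal{R}_2 \subseteq \cdots \subseteq \mathcal{R}_k \subseteq \cdots \subseteq \mathcal{R}$.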

The properties of these families of languages have been studied in the
literature [?]. In particular, it is known that $\mathcal{R}_1 = \mathcal{R}$ if and only if the independence
relation $I$ is transitive; on the other hand, if $I$ is not transitive, then we get the
following chain of strict inclusions: $\mathcal{R}_1 \subsetneq \mathcal{R}_2 \subsetneq \cdots \subsetneq \bigcup_{k} \mathcal{R}_k \subsetneq \mathcal{R}$.

Furthermore, we say that a rational trace language $T$ is polynomially ambiguous
if it is represented by a regular language $L$ such that, for some polynomial
$p(n)$, we have $d_L(n) \le p(n)$ for every $n > 0$. Observe that there exist examples
of polynomially ambiguous rational trace languages which are not finitely
ambiguous. For instance, fixing $I = \{(a, b), (b, c)\}$, if $L = (a^*c)^*(ab)^*c(a^*c)^*$,
then it turns out that $[L]$ does not belong to $\bigcup_k \mathcal{R}_k$; however, $[L]$ is
polynomially ambiguous with $d_L(n) = O(n)$ since, for every $x$ of the form
$x = a^{k_1}c \ldots (ab)^{k_s}c \ldots a^{k_t}c$ with $k_1, \ldots, k_s, \ldots, k_t \in \mathbb{N}$, it holds

$$L \cap [x] = \{a^{k_1}c \ldots (ab)^{k_i}c \ldots a^{k_t}c : k_i = k_s,\ 1 \le i \le t\}.$$
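
The example can be checked by brute force on short words. The sketch below enumerates the trace $[x]$ by repeatedly swapping adjacent independent letters and counts its members in $L$, reading $L$ as the regular expression `(a*c)*(ab)*c(a*c)*` and $I$ as the symmetric relation generated by the pairs (a, b) and (b, c). It takes time exponential in general and is not the counting routine used for the theorem below; all names are hypothetical.

```python
import re


def trace_class(word, independence):
    """All words equivalent to `word` under swaps of adjacent independent letters."""
    seen, stack = {word}, [word]
    while stack:
        w = stack.pop()
        for i in range(len(w) - 1):
            if frozenset((w[i], w[i + 1])) in independence:
                v = w[:i] + w[i + 1] + w[i] + w[i + 2:]
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return seen


def representatives(word, independence, pattern):
    """Members of the trace of `word` that belong to the language of `pattern`."""
    return {w for w in trace_class(word, independence) if re.fullmatch(pattern, w)}


I = {frozenset("ab"), frozenset("bc")}  # unordered pairs of independent letters
L = r"(a*c)*(ab)*c(a*c)*"
```

For $x = acabc$ the two blocks $a^1c$ and $(ab)^1c$ have equal exponents, so the $(ab)$ block can sit in either position and `representatives("acabc", I, L)` has two elements; for $x = aacabc$ the exponents differ and only one representative remains.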

Now, let us go back to our problem: here, we want to use $L$ as an ambiguous
description of $[L]$. Then, also in this case, we first have to design two routines: one
for generating a word in $L$ uniformly at random and the other for determining
the number of representatives of a trace in a given regular language $L$. The first
routine can be easily obtained in the same vein as [?]. The other algorithm is
given by adapting a procedure for solving the membership problem for rational
trace languages [?]. As a consequence, we get the following

Theorem 5. Let $T \subseteq M(\Sigma, I)$ be a finitely ambiguous rational trace language
and assume $I \neq \emptyset$. Then, $T$ admits a u.r.g. working in $O(n^\alpha \log n)$ time and
using $O(n^2 \log n)$ random bits on a PrRAM under logarithmic cost criterion,
where $\alpha$ is the size of the maximum clique in $(\Sigma, I)$. Moreover, there exists an
r.a.s. for the census function of $T$ of the same time complexity. On the other
hand, if $T$ is polynomially ambiguous, then it admits a polynomial time u.r.g.
and a polynomial time r.a.s. for its census function.

Abstract. We study several interesting variants of the $k$-server problem.
In the CNN problem, one server services requests in the Euclidean plane.
The difference from the $k$-server problem is that the server does not
have to move to a request; it only has to move to a point that
lies on the same horizontal or vertical line as the request. This, for
example, models the problem faced by a crew of a Certain News Network
trying to shoot scenes on the streets of Manhattan from a distance; the
crew only has to be on a matching street or avenue. The CNN problem
contains as special cases two important problems: the bridge problem,
also known as the cow-path problem, and the weighted 2-server problem
in which the 2 servers may have different speeds. We show that any
deterministic on-line algorithm has competitive ratio at least $6 + \sqrt{17}$. We
also show that some successful algorithms for the $k$-server problem fail to
be competitive. In particular, we show that no natural lazy memoryless
randomized algorithm can be competitive.

The CNN problem also motivates another variant of the $k$-server problem,
in which servers can move simultaneously, and we wish to minimize the
time spent waiting for service. This is equivalent to the regular $k$-server
problem under the $L_\infty$ norm for movement costs. We give a $\frac{1}{2}k(k + 1)$
upper bound for the competitive ratio on trees.



1 Introduction 

Consider a CNN crew trying to shoot scenes in Manhattan. As long as they are 
on a matching street or avenue, they can zoom in on a scene. If a scene happens 
to be at an intersection, the crew has two choices: street or avenue. Of course, 
the crew must make its choice on-line, without knowing where the subsequent 
scenes will be. 

This is an example of an interesting variant of the $k$-server problem. We
can formulate the CNN problem as follows: there is one server in the plane
which services a sequence of requests (points of the plane). To service a request
$r = (r_1, r_2)$, the server must align itself with the request either horizontally or
vertically, i.e., it must move to a point of the vertical line $x = r_1$ or a point of
the horizontal line $y = r_2$. The goal is to minimize the total distance traveled
by the server. In the on-line version of the problem the requests are revealed
progressively.
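
The cost model can be made concrete with a naive on-line rule that always aligns along the cheaper axis. The sketch is only meant to illustrate how requests are served and charged; the function name `serve_greedy` is hypothetical, and nothing here claims that this rule is competitive.

```python
def serve_greedy(start, requests):
    """Serve each request by the shorter of the two axis-parallel moves.

    Returns the total distance traveled and the final server position.
    """
    x, y = start
    total = 0.0
    for (r1, r2) in requests:
        dx, dy = abs(x - r1), abs(y - r2)
        if dx <= dy:
            total += dx  # reach the vertical line x = r1
            x = r1
        else:
            total += dy  # reach the horizontal line y = r2
            y = r2
    return total, (x, y)
```

For example, starting at the origin with requests (1, 5) and (3, 1), the server first moves one unit horizontally to the line $x = 1$ and then one unit vertically to the line $y = 1$, for a total distance of 2.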
H. Reichel and S. Tison (Eds.): STACS 2000, LNCS 1770, pp. 581-[?], 2000.
(c) Springer-Verlag Berlin Heidelberg 2000



582 Elias Koutsoupias and David Scot Taylor 



A more interesting formulation of the CNN problem results by assuming that
we have 2 servers, moving in different dimensions, independent of each other.
This allows us to generalize the problem as follows: there are two metric spaces
$M_1$ and $M_2$ with one server in each one. A request is a pair of points $(x_1, x_2)$
with $x_i \in M_i$. To service the request, we have to move only one server to the
requested point of its space. We will call this problem the sum of two 1-server
problems. The CNN problem is the special case where both metric spaces $M_1$
and $M_2$ are lines.

More generally, let $\tau_1, \tau_2, \ldots$ be task systems [?] (not necessarily distinct).
We can synthesize these task systems to get two new interesting on-line problems:
the sum and the product of $\tau_1, \tau_2, \ldots$. The sum is the problem where we get
requests (tasks) for each task system and we have to service only one of them. The
product, on the other hand, is the problem where we have to service all of them.
When all task systems are identical, $\tau_i = \tau$, the product is related to randomized
on-line algorithms for the task system $\tau$. It is a trivial fact that a deterministic
algorithm for the product of $n$ copies of $\tau$, with each request the same across
all spaces, is no different from a randomized algorithm against an oblivious
adversary with exactly $n$ (equiprobable) random choices; these algorithms are
called barely random, or mixed strategies, in the literature [?].

The CNN problem belongs to the class of sum problems. In fact, it is one of 
the simplest conceivable sum problems and a stepping stone towards building a 
robust (and less ad hoc) theory of on-line computation. Despite its importance, 
there is no definite positive result for this problem. The lack of results should
be attributed to the hardness of the problem rather than to a lack of interest. The
problem has been known to the research community for quite a few years.^1 The
CNN problem and more generally the sum of on-line problems also give flexibility
to model problems which the k-server problem cannot. For instance, while the
k-server problem has been used to model the behavior of multiple heads on
a disk, the CNN problem can be used to model retrieving information which
resides on multiple disks. This, for example, happens when we replicate data to
achieve higher performance or fault tolerance [?]. Each disk may have
information in completely different locations, leading to independent costs for 
information retrieval. We wish to minimize time spent looking for data, but do 
not actually care which disk the information comes from. In contrast, writing 
must be performed to all disks; this is closer in spirit to the product on-line 
problem mentioned above, but in our worst case analysis, there will be little or 
no writing at all. 

In this work, we use competitive analysis [?] to evaluate the quality
of on-line algorithms; the competitive ratio is defined to be the worst-case
performance of the on-line algorithm, compared to the optimal cost for the same
sequence of requests. More precisely, algorithm ALG is c-competitive if there is a
constant $a$ such that over any finite input sequence $\rho$,
$\mathrm{ALG}(\rho) \le c \cdot \mathrm{OPT}(\rho) + a$,
where $\mathrm{OPT}(\rho)$ is the optimal cost for $\rho$. The game-theoretic structure of the
competitive ratio suggests considering the on-line algorithm as a strategy that
competes against an optimal “adversary”, who selects the requests and services
them too.

^1 The CNN problem was originally proposed by Mike Saks and William Burley, who
obtained some initial results. The name (CNN) was suggested by Gerhard Woeginger.

In this work, we show some negative results (lower bounds and failed attempts
to “import” the k-server theory to this problem). There are no known upper
bounds. Compare this with the k-server problem: although the k-conjecture has
not been resolved yet, we now know the competitive ratio within a factor of 2
[?]. In particular, the 2-server problem was settled from the beginning [?]. The
CNN problem seems very similar to the 2-server problem. Yet, almost all known
competitive algorithms for the k-server problem fail for the CNN problem.

We prove that there is no competitive natural lazy memoryless randomized
algorithm (Theorem 1) for the CNN problem. We also observe that if we restrict
the requests to a line, the problem is equivalent to the weighted 2-server
problem [?] in a line. The weighted k-server problem is the variant of the standard
k-server problem in which servers have different speeds and the cost is
proportional to the time needed to service a request. We show (Theorem 2) that any
deterministic on-line algorithm has a competitive ratio of at least $6 + \sqrt{17}$ for the
weighted server problem in a line (and thus for its generalization, the CNN
problem). This lower bound holds when one server is arbitrarily faster than the other.
We also show that some obvious candidate algorithms are not competitive for
the CNN problem (Proposition 1). Some of the results extend directly to the CNN
problem in higher dimensions (the sum of more 1-server problems), in which
the lower bounds of [?] also apply. It is easy to show that for the sum of any
$k$ non-trivial spaces (at least 2 points each) a $2^k - 1$ lower bound on the ratio
exists, and that for the simplest spaces ($k$ spaces of 2 points each, 1 unit apart),
this bound is tight.

Finally, we study a variant of the k-server problem, motivated by the CNN
problem. Instead of trying to minimize the cost of moving servers, we try to
minimize the time spent waiting for service, but we allow multiple servers to
move simultaneously [?]. When a request is made, the on-line algorithm specifies
possible movement for each of the $k$ servers, and tries to minimize the cost of
their total movement, under the $L_\infty$ norm. For $k$ servers in a tree, we determine
the exact ratio $\frac{1}{2}k(k+1)$ of the DC-TREE algorithm of [?]. In particular for
$k = 2$, we show that DC-TREE is optimal with competitive ratio 3.

2 Memoryless Randomized Algorithms 

Our first result concerns randomized algorithms for the CNN problem. Within
the plane, we define a natural memoryless algorithm to be an algorithm which
has the following property: The probability of satisfying any request by moving
the shorter distance is only a function of the ratio of the shorter distance to the
longer distance. We break this into several subproperties:

- Translational Invariance: The algorithm does not consider what point its
server is at, only relative positions of points within the space (such as the
distances to the request).

- Symmetry 1: The algorithm is unbiased towards horizontal and vertical
moves. More precisely, let $(x, y)$ be the on-line position and consider two
possible next requests at $(x + a, y + b)$ and $(x + b, y + a)$. If the server moves
horizontally with probability $p$ to service the first request, it moves vertically
with the same probability $p$ to service the second request. Similarly, for a request
at $(x - a, y + b)$ the server also moves horizontally with probability $p$.

- Scalability 1: Let $(x, y)$ be the on-line position and consider two possible
next requests at $(x + a, y + b)$ and $(x + \lambda a, y + \lambda b)$, for some $\lambda$. The server
moves horizontally with the same probability for both requests.



We call these properties natural because they reflect the fact that for this 
problem, the plane is symmetric over $y = x$, $y = 0$, and $x = 0$, and is scale
and translation invariant. We also require that the algorithm be lazy: if the
algorithm is at position $(x, y)$, and the request is at $(r_x, r_y)$, it will service the
request by moving to either $(r_x, y)$ or $(x, r_y)$. It is standard to require memoryless
algorithms to be lazy, otherwise the on-line algorithm can try to encode history 
information within its non-lazy movement. Although this encoding is impossible 
for natural algorithms, we nevertheless require the laziness property. 

Unlike the k-server problem, for which a natural lazy memoryless algorithm,
HARMONIC, has finite competitive ratio ($O(2^k \log k)$ [?]), no such algorithm
for the CNN problem is competitive.



Theorem 1. There is no competitive natural lazy memoryless randomized al- 
gorithm for the CNN problem. 



Proof. Fix some memoryless algorithm with optimal competitive ratio. Assume
that the adversary is at position (0, 0) (by Translational Invariance). We want to
estimate the on-line cost when the adversary produces a worst-case sequence of
lazy requests, defined as requests for which it does not have to move, i.e., points
from the lines $x = 0$ and $y = 0$. Let $\Phi(x, y)$ be the expected cost of the on-line
algorithm to converge to (0, 0) starting at position $(x, y)$.

It follows from the above properties of the on-line algorithm that $\Phi$ satisfies:

- Symmetry 2: $\Phi(x, y) = \Phi(|y|, |x|)$.

- Scalability 2: $\Phi(\lambda \cdot x, \lambda \cdot y) = \lambda \cdot \Phi(x, y)$.



To prove our theorem, we examine various possible combinations of on-line
positions and requests. We start by considering $\Phi(1, \varphi)$, where
$\varphi = \frac{1 + \sqrt{5}}{2}$, the golden ratio. We try to find recursive bounds
on $\Phi$. The following table summarizes combinations of on-line positions, requests,
and resulting bounds on $\Phi$.

On-Line         Next               Resulting Inequality                                                                           Justification
Position        Request

$(1,\varphi)$   $(\varphi^2,0)$    $\Phi(1,\varphi) \ge \frac12(\Phi(\varphi^2,\varphi)+\varphi) + \frac12(\Phi(1,0)+\varphi)$    Symmetry 1 ($\varphi^2 - 1 = \varphi$),
                                   $= \frac{\varphi}{2}\Phi(1,\varphi) + \frac12\Phi(1,0) + \varphi$                              Scalability 2, Symmetry 2

$(1,0)$         $(0,1)$            $\Phi(1,0) \ge \frac12(\Phi(1,1)+1) + \frac12(\Phi(0,0)+1)$                                    Symmetry 1
                                   $= \frac12\Phi(1,1) + 1$

$(1,1)$         $(0,\varphi)$      $\Phi(1,1) \ge p(\Phi(1,\varphi)+\varphi-1) + (1-p)(\Phi(0,1)+1)$                              For some $p$,
                                   $= p(\Phi(1,\varphi)+\varphi-1) + (1-p)(\Phi(1,0)+1)$                                          Symmetry 2

$(1,0)$         $(0,\varphi)$      $\Phi(1,0) \ge p(\Phi(0,0)+1) + (1-p)(\Phi(1,\varphi)+\varphi)$                                Symmetry 1,
                                   $= (1-p)\Phi(1,\varphi) + p + (1-p)\varphi$                                                    Scalability 1


If we assume that $\Phi$ is bounded we get a contradiction: from the first
three bounds we get $p < \varphi - 1$ and from the first and last $p > \varphi - 1$. Therefore,
if the initial position is (0, 0) and the adversary moves to position $(1, \varphi)$, the
on-line cost to converge to $(1, \varphi)$ is unbounded. This shows that the competitive
ratio is unbounded. □
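
The contradiction can be checked numerically. The Python sketch below is not part of the paper; it encodes one possible way of combining the four inequalities of the table, with the hypothetical abbreviations A = Φ(1, φ) and B = Φ(1, 0) introduced here for illustration. Eliminating B and Φ(1, 1) leaves two inequalities of the form "coefficient times A is at least a positive number", so a finite positive A requires both coefficients to be positive.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, PHI**2 - 1 == PHI

def coefficient_rows_1_2_3(p):
    """Coefficient of A after combining rows 1, 2 and 3 of the table.

    Row 1 gives B <= (2 - PHI) * A - 2 * PHI.
    Rows 2 and 3 give (1 + p) * B >= p * A + (positive terms).
    Together: ((1 + p) * (2 - PHI) - p) * A >= positive.
    """
    return (1 + p) * (2 - PHI) - p

def coefficient_rows_1_4(p):
    """Coefficient of A after combining rows 1 and 4 of the table.

    Row 4 gives B >= (1 - p) * A + (positive terms).
    Together with row 1: (p - (PHI - 1)) * A >= positive.
    """
    return p - (PHI - 1)

def bounded_cost_possible(p):
    """True only if a finite positive A could satisfy all four rows."""
    return coefficient_rows_1_2_3(p) > 0 and coefficient_rows_1_4(p) > 0
```

The first coefficient is positive only for p < (2 - φ)/(φ - 1) = φ - 1 and the second only for p > φ - 1, so no probability p passes both tests.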



3 Server Problems with Different Speeds 

We now turn our attention to the restricted CNN problem where all requests are 
from a line (the server is still allowed to move anywhere in the plane). A lower 
bound for the restricted problem is naturally a lower bound for the unrestricted 
one. Without loss of generality, we assume that the line is of the form y = mx, 
for some constant m. For m = 1, the problem is equivalent to the standard 
2-server problem in a line, where moving in the x dimension corresponds to
moving one of the servers, and the y dimension the other. Changing $m$ gives a
more interesting problem: if all requests are restricted to the form $(x, mx)$, it
corresponds to a request in a line at $x$ for a 2-server problem, but this time the
servers have different costs for movement. Loosely, this can be interpreted as
having servers with different speeds,^2 and we wish to minimize the total delay,
i.e., the time requests wait for service.

In general, the restriction of the multidimensional CNN problem where all
requests are from a line is equivalent to the weighted k-server problem in a line.
The general weighted server problem was studied in [?]. This work gives a lower
bound of [...] for any metric space with at least $k + 1$ points (and arbitrary
speeds). No upper bound is known for arbitrary metric spaces, but [?] gives
a doubly exponential upper bound ($2^{2^{O(k)}}$) for uniform metric spaces; this is
reduced to exponential when the servers have at most 2 different speeds.

For the restricted version of the CNN problem, the weighted 2-server problem
in a line, we show a deterministic lower bound of $6 + \sqrt{17}$. The surprising fact
exploited in the proof is that the adversary can “simulate” the BRIDGE (or
cow-path) problem [?]. Thus, the CNN problem contains as a subproblem another
fundamental on-line problem. Interestingly, we know of two different ways to
view the BRIDGE problem as a special case of the CNN problem. This shows the
close connection between these two problems, although the CNN problem is, of
course, much more complicated.

The BRIDGE problem is a simple on-line problem, in which an explorer comes 
to a river. There is a bridge across the river, but it is not known how far away it 
is, or if it is upstream or downstream. The explorer must try to find the bridge 
while minimizing movement. The optimal solution involves alternating between 
the upstream and downstream directions, exploring 1 distance unit downstream, 
then 2 units (from the original starting position) upstream, then 4 downstream, 
and continuing in powers of 2 until the bridge is found. This strategy results in 
total movement no more than 9 times the distance from the original position to 
the bridge (plus a constant). 
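
The doubling strategy described above can be sketched in a few lines of Python. This is an illustration rather than code from the paper; the function name `doubling_cost` and the sign convention (positive positions are downstream, negative ones upstream) are assumptions made here.

```python
def doubling_cost(bridge):
    """Distance walked by the doubling strategy until the bridge is reached.

    bridge -- signed, non-zero position of the bridge relative to the start
              (assumed convention: positive = downstream, negative = upstream)
    """
    travelled = 0.0
    reach = 1.0      # how far the current excursion goes from the start
    direction = 1    # the first excursion goes downstream
    while True:
        if bridge * direction > 0 and abs(bridge) <= reach:
            return travelled + abs(bridge)   # bridge found on this excursion
        travelled += 2 * reach               # walk out and back to the start
        reach *= 2
        direction = -direction
```

For a bridge just beyond a turning point, for example at 4.0001 downstream, the explorer walks 2 + 4 + 8 + 16 units in failed excursions before the final 4.0001 units, which approaches the factor 9 as the distances grow. For a bridge at distance d the walked distance never exceeds 9d + 2, the constant 2 covering the case of a very close bridge on the upstream side.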

Theorem 2. For the 2-server problem in a line, in which one server is $m$ times
faster than the other, the deterministic competitive ratio is at least $6 + \sqrt{17}$. More
precisely, for any $\epsilon > 0$, there is a sufficiently large $m$ such that the competitive
ratio is at least $6 + \sqrt{17} - \epsilon$.

Proof. We first show a weaker lower bound of 9 to exhibit the relation between
the CNN problem and the BRIDGE problem. The role of the explorer will be
played by the slow server.

^2 To make this interpretation strict, we consider that we are only allowed to move
one server at a time, which is natural for our problem motivation (we can only
move along streets or avenues). It is also worth considering this question without
this restriction, which corresponds to using the Chebyshev distance ($L_\infty$) instead of
the Manhattan distance ($L_1$). The k-server variant of this problem, where servers
are allowed to move simultaneously, was introduced in [?] as the min-time server
problem. We study this in Section 4.

We shall assume $m \gg 1$. Also, without loss of generality, we can assume that
the on-line algorithm is lazy (moves a server only to service requests). Let $[l, r]$
be the interval of the line explored (visited so far) by the slow server. Initially, it
is safe to assume that the slow server is at $r = 0$ and the fast server is at $l = -1$.
When the on-line slow server is at $r$ the adversary’s strategy distinguishes two
cases: if the fast server is to the right of $r$, the next request is at $l$; if the fast
server is to the left of $r$, the next request is at $r + \delta$, where $\delta$ is an arbitrarily
small positive distance. The adversary’s strategy when the slow server is at $l$ is
symmetric.

This adversary’s strategy forces the slow server to explore a larger and larger
portion of the line. The slow server at $r$ cannot continually move to the right,
because this would result in competitive ratio $m$. Eventually, to achieve ratio less
than $m$, the slow server will need to go to point $l$. The adversary can continue
to force the slow server to “zig-zag”, exploring larger and larger segments of the
line, exactly as the explorer does in the BRIDGE problem.

Assume that at the end of the game, the slow server has explored the interval
$[-z, y]$, for $z > y$, and has just moved from $-z$ to $y$. Since the competitive ratio
of the BRIDGE problem is 9, the total distance moved by the slow server must
be at least $9y$ (minus an insignificant term). On the other hand, the adversary
can service all requests by moving its slow server to $y$ and its fast server to $-z$.
Its total cost is $y + z/m$, which for large $m$ is approximately $y$. (To be fair, $m$
should be fixed before the on-line algorithm is forced to choose $z$ values, but we
can always guarantee that $z$ is no more than some modest constant multiple of
$y$ or else we again reach ratio $m$.)

Thus, the competitive ratio is at least $9 - \epsilon$, where $\epsilon$ tends to 0 as $m$ tends
to $\infty$. To simplify the presentation, we henceforth will drop the $\epsilon$ term.

It takes a little effort to improve the bound to 10. Observe that we ignored
the cost of the on-line fast server. Let $x_0$ be this cost. The adversary has the
alternative strategy to service all requests with its fast server. Its total cost then
is $x_0$. Thus, the competitive ratio is at least

\[ \min_{x_0, y} \max\left( \frac{9y + x_0}{x_0},\ \frac{9y + x_0}{y} \right). \]

The minimum value, 10, is clearly obtained when $x_0 = y$.

To get the improved lower bound $6 + \sqrt{17}$, we extend the above sequence of
requests. Let $\rho$ be the sequence of requests described above (which completes the
“bridge simulation”) and consider the request sequence
$\rho\,((0, -z)^{j_i} (y, -z)^{k_i})^n$
where $i$ varies from 1 to $n$, and $j_i$ ($k_i$) refers to the number of times the first
(second) pair is repeated during the $i$-th repetition of the whole phrase. Let $j_i$
and $k_i$ be determined as follows: after servicing $\rho$, the on-line slow server is at
$y$. To service $(0, -z)^{j_1}$, it may use the fast server for a while, but eventually, it
must move its slow server to 0. Let $j_i$ be the number of repetitions of $(0, -z)$
needed to make the on-line slow server move to 0 for the $i$-th time, and $k_i$ be the
number of repetitions of $(y, -z)$ needed to make the on-line slow server move
back to $y$ for the $i$-th time. Let $x_{2i-1} = j_i z/m$, the movement of the on-line fast
server before the move to 0, and $x_{2i} = k_i z/m$, the movement of the on-line
fast server before the move back to position $y$.

The adversary might end the game after any of the slow server moves. The
first few terms of the ratio $R$ are:

\[ \min_{x_i, y} \max\left( \frac{9y + x_0}{y},\ \frac{10y + x_0 + x_1}{x_0},\ \frac{11y + x_0 + x_1 + x_2}{y + x_1},\ \frac{12y + x_0 + x_1 + x_2 + x_3}{x_0 + x_2},\ \ldots \right) \]

and in general, for $j \ge 0$:

\[ \frac{(9 + 2j)y + \sum_{i=0}^{2j} x_i}{y + \sum_{i=1}^{j} x_{2i-1}}, \qquad \frac{(10 + 2j)y + \sum_{i=0}^{2j+1} x_i}{\sum_{i=0}^{j} x_{2i}}. \]

The different denominator types are from the two off-line strategies we have
already seen for the $\rho$ requests: moving both servers, or just the fast one. The
adversary can choose either of these two.^3

It turns out that to minimize the expression above, we can assume that all
values are equal, and the equations simplify to:

\[ \frac{x_0 + x_1 + 10y}{x_0} = \frac{x_0 + 9y}{y} \]

\[ \forall i \ge 2, \quad \frac{x_{i-1} + x_i + 2y}{x_{i-1}} = \frac{x_0 + 9y}{y} \]

Scaling $y$ to be 1, and solving, we get:

\[ x_1 = x_0^2 + 8x_0 - 10 \]

\[ \forall i \ge 2, \quad x_i = (x_0 + 8)^{i-1} x_1 - 2 \sum_{j=0}^{i-2} (x_0 + 8)^j \]

\[ \forall i \ge 2, \quad x_i = (x_0 + 8)^{i-1} x_1 - 2\,\frac{(x_0 + 8)^{i-1}}{x_0 + 7} + \frac{2}{x_0 + 7} \]

and finally

\[ \forall i \ge 2, \quad x_i = (x_0 + 8)^{i-1} \left( x_0^2 + 8x_0 - 10 - \frac{2}{x_0 + 7} \right) + \frac{2}{x_0 + 7} \]

All $x_i$ values must be positive, so the smallest possible value for the equations
is when $x_0^2 + 8x_0 - 10 - \frac{2}{x_0 + 7} = 0$. The only positive root is $x_0 = \sqrt{17} - 3$, which
gives the stated bound of $6 + \sqrt{17}$. □
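
The last step can be checked numerically. The Python sketch below is an illustration, not part of the paper; the function names are hypothetical, and y is scaled to 1 as in the proof, so the recurrence is x_i = (x_0 + 8) x_{i-1} - 2 with x_1 = x_0^2 + 8 x_0 - 10 and the ratio equals x_0 + 9.

```python
import math

def first_fast_move(x0):
    """x_1 = x_0^2 + 8*x_0 - 10, with y scaled to 1."""
    return x0 * x0 + 8 * x0 - 10

def stays_positive(x0, steps=60):
    """Iterate x_i = (x_0 + 8) * x_{i-1} - 2 and report whether all x_i > 0.

    Intended for x0 clearly above or below the critical value; exactly at
    the critical value rounding errors are amplified at every step, so the
    answer there would not be meaningful.
    """
    x = first_fast_move(x0)
    for _ in range(steps):
        if x <= 0:
            return False
        if x > 1e12:       # clearly diverging upwards, stays positive
            return True
        x = (x0 + 8) * x - 2
    return x > 0

root = math.sqrt(17) - 3   # critical value of x_0
bound = 9 + root           # ratio (x_0 + 9y)/y with y = 1
```

At `root` the first move `first_fast_move(root)` equals the fixed point 2/(x_0 + 7) of the recurrence, so the sequence is constant; slightly smaller values of x_0 drive the sequence negative, slightly larger ones keep it positive.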

^3 In fact, the off-line servers have a third main strategy to consider, which is moving
the slow server to point z and servicing the rest of the queries with the fast server.
By eliminating this option, we may be proving a slightly weaker lower bound than
possible from these three strategies; we can show however that the third strategy
does not improve the lower bound by more than 0.22. The equations are complicated
by the fact that once the third option is added, it is no longer safe to assume that
the optimal bridge solution for $\rho$ will give the best solution to the equations.

To see just how difficult it is to find a competitive ratio for the CNN problem,
we notice that some simple algorithms which are competitive for the 2-server
problem are not competitive for the case when the servers have different
movement costs. The “double coverage” (DC) algorithm in a line is the following
simple algorithm: if the request is between the two servers, move both towards
it until the request is served. Otherwise, move the closer server (ties broken
arbitrarily). The “balance”, or BAL, algorithm is also simple: to answer any query,
move the server which will have the minimum cumulative cost if it moves to the
request. More general balance algorithms base their decision to move a server to
a request on two parameters: the cumulative cost of a server and the distance
to the request.
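
As a reference point, the two rules can be written down for the standard (equal-speed) 2-server problem in a line. The Python sketch below is illustrative only: the function names `dc_step` and `bal_step` are hypothetical, costs are plain distances, and no attempt is made to model the different speeds discussed next.

```python
def dc_step(servers, request):
    """One step of double coverage for two equal-speed servers on a line.

    servers -- pair of positions; returns (new positions, total distance moved).
    """
    a, b = sorted(servers)
    if a <= request <= b:
        d = min(request - a, b - request)  # both move until one arrives
        return (a + d, b - d), 2 * d
    if request < a:                        # request left of both: a is closer
        return (request, b), a - request
    return (a, request), request - b       # request right of both: b is closer

def bal_step(servers, totals, request):
    """One step of the basic balance rule.

    servers -- list of positions, totals -- cumulative cost of each server.
    Moves the server whose cumulative cost after the move would be smallest.
    """
    i = min(range(len(servers)),
            key=lambda s: totals[s] + abs(servers[s] - request))
    cost = abs(servers[i] - request)
    new_servers, new_totals = list(servers), list(totals)
    new_servers[i] = request
    new_totals[i] += cost
    return new_servers, new_totals, cost
```

For example, `dc_step((0, 10), 4)` moves both servers by 4 and ends at positions 4 and 6.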

Proposition 1. For 2 servers in a line, one with speed 1 and the other $m$, no
DC or BAL type algorithm has constant competitive ratio (bounded independently
of $m$).

Obviously these algorithms need to be modified to account for the different
speeds of the servers: for instance, in DC it must be possible for the fast server
to “pass” the slow server for requests outside of their convex hull, or else it
is trivial to achieve competitive ratio at least equal to $m$. We expect that the
statement holds for any variant which maintains the spirit of either algorithm,
and use a simple example as an intuitive justification. For DC, consider an on-line
configuration at $(0, x)$, where the fast server is at 0, and $x$ large enough that the
slow server will reach a request at $x + 1$ before the fast one ($(x + 1)/m > 1$ will do
for the most natural generalization of DC). By repeating the sequence of requests
$0, x, 0, x + 1$, the on-line servers will pay cost 2 for every 4 requests, while an
adversary could satisfy them with just cost $2/m$ (assume $m > 1$). By making $m$
large, we can get an arbitrarily large competitive ratio. A similar example can
be used for BAL.

Leaving our k-server interpretation behind, we can achieve the same results
for the CNN problem by considering requests which lie in the two lines $y = 0$ and
$y = 1$. For this special case of the CNN problem, a slightly weaker lower bound
was obtained by William Burley and the first author. Now any request can be
satisfied at a cost of 1 by moving from one line to the other. We use a strategy
similar to before. Suppose the leftmost and rightmost positions of our server so
far have been $[l, r]$. If the server is on $y = 0$, place a request at $(l - 1, 1)$, and
if the server is on $y = 1$, place a request at $(r + 1, 0)$. We use these requests to
again make the server move away from the center just as before, simulating the
BRIDGE explorer, and vertical movement corresponds to movement of the fast
server in the previous argument. At the end, the adversary can again make the
server alternate between the origin and the shorter “arm” of exploration. All
equations are the same.

4 The k-Server Problem under the $L_\infty$ Norm

In this section, we consider the k-server problem, where servers have the same
speed, but can be moved simultaneously, and time of service is to be minimized.
Once a request is served, the next request is given. By ignoring the fact that our
algorithm (and the optimal one) can move more than one server simultaneously,
we can achieve a competitive ratio for this problem which is $k$ times larger
than for the regular k-server problem. Using the best known bound of [?], this
gives a $2k^2 - k$ bound for this ratio, but we expect that this can be improved.
For example, within the uniform metric space, moving the servers in order will
achieve the (optimal) ratio of $k$.

We show that in a tree, the competitive ratio is no worse than $\frac{1}{2}k(k+1)$. We
employ the DC-TREE algorithm of [?], generalized from DC of [?]. The algorithm
is defined as follows: move each of the servers with an unblocked (by other
servers) path to the request towards the request at a constant speed. Note that
servers may begin moving, and later stop moving as they become blocked by
other servers which move onto their path.



Theorem 3. For $k$ servers in a tree under the $L_\infty$ norm, DC-TREE has
competitive ratio $\frac{1}{2}k(k+1)$.

Proof. Let $M_{\min}$ be the distance for the best matching between the on-line
and off-line servers, and $\Sigma_{\text{DC-TREE}}$ be the sum of the distances between all pairs
of on-line servers. To show that DC-TREE is competitive, consider some phase in
which a fixed number of servers are moving, and use the following potential:

\[ \Phi = \frac{(k+1)\,M_{\min}}{2} + \frac{\Sigma_{\text{DC-TREE}}}{2}. \]

The analysis is as in [?]. For a move of cost $d$, the off-line servers can increase
$M_{\min}$ by $kd$, giving a ratio of $\frac{1}{2}k(k+1)$.
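
The on-line side of the argument can be sketched as follows. This is a reconstruction along the lines of the usual double-coverage accounting, not a quotation of the cited analysis. It assumes the potential has the form $\Phi = \frac{k+1}{2} M_{\min} + \frac{1}{2}\Sigma$, where $\Sigma$ is the sum of pairwise distances between on-line servers, and it uses two symbols introduced here for illustration: $j$, the number of servers moving during a phase, and $\delta$, the duration of the phase (which is also its cost under the $L_\infty$ norm).

```latex
% One mover is matched to the off-line server at the request, so the matching
% shrinks by \delta there and grows by at most \delta for each other mover.
% Counting pairs of on-line servers gives the change in \Sigma.
\begin{align*}
\Delta M_{\min} &\le (j-2)\,\delta, \\
\Delta \Sigma   &= \bigl(2k - j(k+1)\bigr)\,\delta, \\
\Delta \Phi     &\le \frac{k+1}{2}\,(j-2)\,\delta
                   + \frac{1}{2}\bigl(2k - j(k+1)\bigr)\,\delta
                 = -\delta .
\end{align*}
```

Under these assumptions the potential drops by at least the time spent in every phase, whatever the number $j$ of moving servers, which is what the competitive bound requires.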

To show this ratio is tight for DC-TREE, consider servers (on-line and
off-line) at positions $(2, 4, \ldots, 2k)$ of a line. For a cost of 1, the off-line algorithm can
move all of its servers to $(1, 3, \ldots, 2k - 1)$. The adversary is lazy, and will at each
time request its uncovered server which is at the lowest value (i.e., the request
sequence will be 1, 3, 1, 5, 3, 1, 7, 5, 3, 1, ...). Each request will cost DC-TREE 1 to
serve, and it will take $\frac{1}{2}k(k+1)$ total requests to converge to the off-line position,
at which time we are at a position similar to the original one. □
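
The tight example can be replayed with a short simulation. The Python sketch below is illustrative and not from the paper: it specializes DC-TREE to a line (where only the nearest server on each side of the request has an unblocked path), charges each request the waiting time, and uses the hypothetical names `dc_line_step` and `tight_example`.

```python
def dc_line_step(servers, request):
    """One DC-TREE step on a line under the min-time (L-infinity) cost.

    servers -- sorted list of server positions
    Returns (new sorted positions, time until the request is served).
    """
    left = [s for s in servers if s <= request]
    right = [s for s in servers if s >= request]
    movers = []
    if left:
        movers.append(max(left))    # nearest server on the left
    if right:
        movers.append(min(right))   # nearest server on the right
    wait = min(abs(s - request) for s in movers)
    new = list(servers)
    for s in movers:
        i = new.index(s)
        new[i] = s + wait if s < request else s - wait
    return sorted(new), wait

def tight_example(k):
    """Replay the adversary of the proof; returns (on-line cost, requests)."""
    online = [2 * i for i in range(1, k + 1)]
    offline = [2 * i - 1 for i in range(1, k + 1)]
    total, requests = 0, []
    while online != offline:
        r = min(p for p in offline if p not in online)  # lowest uncovered point
        online, wait = dc_line_step(online, r)
        total += wait
        requests.append(r)
    return total, requests
```

The off-line cost of the whole round is 1, since all off-line servers shift by one unit simultaneously, so the on-line cost returned by `tight_example(k)` is directly the ratio for that round.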



The following lemma shows that the DC-TREE algorithm is optimal for $k = 2$
and Euclidean metric spaces.

Lemma 1. No on-line algorithm has competitive ratio less than 3 for the
Euclidean 2-server problem under the $L_\infty$ norm.



Proof. It suffices to consider lazy on-line algorithms. Consider requests in a line, 
with initial configuration (on-line and off-line) {0,2}. If there is a request at 1, 
by symmetry, the on-line algorithm can service it by moving the server at 2, and 
it can move the other server to any point in the interval [—1, 1] at no extra cost. 
Now the adversary requests point 3 (and reveals its configuration {1,3}). The 
on-line algorithm must pay at least 2 to service the request. The total on-line 
cost is 3, but the off-line cost is 1 (move the 2 servers together from {0,2} to 
{1,3}). This configuration is similar to the initial one, and the situation can be 
repeated indefinitely. Therefore, no on-line algorithm has ratio less than 3. □

In [?], the DC-TREE algorithm for $k = 2$ can be extended to any metric
space, not just trees. This is done by considering an extension of the metric
space to its “closure” so that any three points are connected by a tree. Consider
“virtual” movement through the closure of the space, and only actually move a
server when it needs to service a request, which will be a point in the original
space. By the triangle inequality, the actual movement will be less than the
total virtual movement. Unfortunately, under the $L_\infty$ norm, this strategy does
not work: except for the last move (to the original space), the virtual movement
is for free, with the cost for the move being dominated by the movement of the
other server. When it comes time to move the server to answer a request, this
virtual movement is no longer available for free. The triangle inequality can no longer
guarantee that the actual movement is less than the sum of virtual movement.

5 Conclusions and Future Work 

We have introduced several interesting variants of the k-server problem, with
the power to model new problems. There are numerous open problems left. We
mention only a few of them here.

The CNN problem is wide open. We conjecture that it has a finite competitive
ratio. In fact, we conjecture that the generalized Work Function Algorithm (the
one that moves a server which minimizes $\lambda w(A') + d(A, A')$) has constant
competitive ratio for the CNN problem for any $\lambda > 1$ ($\lambda = 3$ seems a good candidate);
we conjecture the same for the sum of two 1-server problems in general.

The off-line CNN problem seems interesting in its own right and as a stepping 
stone for the on-line problem. More precisely, we want to find simple and fast 
approximation algorithms for the off-line CNN problem. 

For the Euclidean k-server problem under the $L_\infty$ norm, the intuition behind
DC-TREE suggests that there is an algorithm with ratio better than 4 (which
follows from the fact that 2 servers are 2-competitive), though it must be at least
3 (by Lemma 1).

We believe that for all three problems (sum of server problems, weighted
k-server, $L_\infty$ variant of the k-server problem) the Work Function Algorithm
has almost optimal competitive ratio. We don’t have a candidate randomized
algorithm though.

Abstract. We consider a generalized 2-server problem in which servers
have different costs. We prove that, in uniform spaces, a version of the
Work Function Algorithm is 5-competitive, and that no better ratio is
possible. We also give a 5-competitive randomized, memoryless algorithm
for uniform spaces, and a matching lower bound. For arbitrary metric
spaces, we prove that no memoryless randomized algorithm has a
constant competitive ratio. We study a subproblem in which a request
specifies two points to be covered by the servers, and the algorithm decides
which server to move to which point; we give a 9-competitive
deterministic algorithm for any metric space (no better ratio is possible).



1 Introduction 

In the weighted k-server problem we are given a metric space $M$ with $k$ mobile
servers. Each server $s_i$ has a given weight $P_i > 0$. At each step a request $r \in M$
is issued. In response, one of the servers moves to $r$, at a cost equal to its weight
times the distance from its current location to $r$. This is an online problem, in
the sense that it is required that the algorithm decides which server to move to $r$
before the next request is issued. An online algorithm $A$ is R-competitive if, for
each request sequence $g$, the cost incurred by $A$ is at most $R$ times the optimal
service cost for $g$, plus an additive constant independent of $g$. The competitive
ratio of $A$ is the smallest $R$ for which $A$ is R-competitive.

The unweighted case, with all $P_i = 1$, has been extensively studied during the
last decade. The problem was introduced by Manasse, McGeoch and Sleator [?],
who gave a 2-competitive algorithm for $k = 2$ and proved that $k$ is a lower bound
on the competitive ratio of deterministic algorithms in any metric space. They
also conjectured that there exists a k-competitive deterministic algorithm for
any metric space; this is now called the k-Server Conjecture. For $k \ge 3$, the best
known upper bound is $2k - 1$, by Koutsoupias and Papadimitriou [?]. An upper
bound of $k$ has been proven only in some special cases, including uniform spaces
(i.e., spaces with all distances equal to 1), see [?].

Very little is known about randomized algorithms for $k$ servers. No algorithm
with ratio less than $k$ is known for arbitrary spaces. In uniform spaces, the
competitive ratio is $H_k \approx \ln k$, the k-th harmonic number [?]. For $k = 2$, when
the metric space is the line, Bartal et al. [?] give a 1.987-competitive algorithm.
The best known lower bound for 2 servers is $1 + e^{-1/2} \approx 1.6065$ [?]. Other lower
and upper bounds for this problem can be found in [?].

The weighted case of the k-server problem turns out to be more difficult. Fiat
and Ricklin [?] show that the competitive ratio is at least [...] in any space
with at least $k + 1$ points. They also give a doubly-exponential upper bound for
uniform spaces. For $k = 2$, Koutsoupias and Taylor [?] recently proved that no
10.12-competitive algorithm exists if the underlying metric space is the line. For
uniform spaces, Feuerstein et al. [?] gave a 6.275-competitive algorithm.

Our results. We study the case when $k = 2$. In uniform spaces, we improve the
upper bound from [?] by proving that a version of the Work Function Algorithm
is 5-competitive, and we show that no better ratio is possible. We also give a
5-competitive memoryless randomized algorithm for uniform spaces, as well as
a matching lower bound.

For arbitrary spaces, we prove that there is no memoryless randomized
algorithm with finite competitive ratio. (A similar result was independently obtained
by Koutsoupias and Taylor [?].) This contrasts with the non-weighted case, for
which memoryless algorithms exist for any $k$. For example, the harmonic
algorithm is competitive for any $k$ and, for $k = 2$, a memoryless 2-competitive
algorithm is known [?].

Last, we propose a version of the problem in which a request is specified by
two points, both of which must be covered by the servers, and the algorithm
must decide which server to move to which point. For this version, we show
a 9-competitive algorithm and we prove that no better ratio is possible. This
generalizes the results for the 2-point 1-server request problem [?], as well as
for the cow-path problem [?].

Adversary arguments and potential functions. We view the computation 
as a game between two players, the algorithm and the adversary. In each round, 
the adversary issues a request, serves it using its servers, and then the algorithm 
serves the request (without knowing the position of the adversary servers). A 
potential function $\Phi$ assigns a real number to the current state. To serve our
purpose, $\Phi$ must satisfy the following three properties: (i) $\Phi$ is bounded from
below by some constant, (ii) if the adversary serves the request at cost d, then $\Phi$
increases by at most Rd, and (iii) if the algorithm serves the request at cost d,
then $\Phi$ decreases by at least d. By summing over all requests, it follows that the
algorithm is R-competitive. Intuitively, one can think of $\Phi$ as the credits that the
algorithm has saved in the past and can use to pay for serving future requests.
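
The summation step can be spelled out as follows. This is only a sketch, in notation that is not in the original: $\Phi_t$ denotes the potential after round $t$, $\mathrm{adv}_t$ and $\mathrm{alg}_t$ are the costs paid in round $t$ by the adversary and by the algorithm, $T$ is the number of requests, and $c$ is the constant lower bound from property (i).

```latex
\begin{align*}
\Phi_t - \Phi_{t-1} &\le R\,\mathrm{adv}_t - \mathrm{alg}_t
  && \text{by (ii) and (iii), for each round } t,\\
\sum_{t=1}^{T} \mathrm{alg}_t &\le R \sum_{t=1}^{T} \mathrm{adv}_t + \Phi_0 - \Phi_T
  && \text{summing over } t = 1, \dots, T,\\
&\le R \sum_{t=1}^{T} \mathrm{adv}_t + (\Phi_0 - c)
  && \text{by (i).}
\end{align*}
```

Since $\Phi_0 - c$ does not depend on the request sequence, the last line is R-competitiveness up to an additive constant.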

For randomized algorithms, property (iii) needs to hold on average, over the
random choices of the algorithm. There are subtle differences between different 
adversary models, see 0. We use two models. The weaker, oblivious adversary, 
has to generate the whole request sequence in advance. The stronger, adaptive 
online adversary generates and serves the requests one by one, with the knowl- 
edge of the current positions of the algorithm’s servers. 

One useful principle for designing potential functions is that of the lazy
potential. If the adversary continues to request the position of his servers, the
potential function has to provide enough credit to pay for all moves before the
algorithm converges to the same server positions. Although the potential func- 
tions in our paper are not lazy, they are based on a similar idea, namely they are 
lower bounded by the cost of the algorithm in case the adversary will not move 
the expensive server. Formulas obtained by these methods are complicated, not 
easy to analyze, and not even guaranteed to give the best results. The potential 
functions we actually use are then simplified and improved by trial and error (in 
some cases also using computer experiments). 

Notation. Throughout the paper, without loss of generality, we assume that
$\beta_1 = 1$ and $\beta_2 = \beta \le 1$. Thus $s_1$ and $s_2$ denote the expensive and the cheap
server of the algorithm, respectively. Similarly, by $a_1$ and $a_2$ we denote the
expensive and the cheap server of the adversary.

For any points x, y in the given metric space, xy denotes their distance. A 
metric space is uniform if the distance of any two distinct points is equal to 1. 



2 Randomized Memoryless Algorithms 

In this section we consider randomized memoryless algorithms. Our model of
a memoryless randomized algorithm is this: A memoryless algorithm is simply
a function that receives on input the distances from each server to the request
point r and the distance between the servers, and determines, for each i, the
probability that r is served with $s_i$. The algorithm only moves one server, and
only to the request point. This is a natural requirement since, in certain spaces, it 
may be possible to encode memory states by perturbing the other server position. 

First we give a 5-competitive algorithm for uniform spaces and prove that 
it is optimal. The lower bound holds even for the weaker oblivious adversary, 
while the upper bound is valid against the stronger adaptive online adversary. 
For general spaces, we prove that no memoryless algorithm can achieve a finite
competitive ratio, even if the underlying metric space is the line.

Both lower bounds are based on the following observation. Suppose that
initially $s_1$ is at point a and $s_2$ at c, and the adversary alternates requests to b
and c, which have the same distance from a. Then the probability that $s_1$ ends
at c is at most 1/2. The reason is that as long as $s_1$ stays at a, the situation
remains identical from the viewpoint of a memoryless algorithm. The first request
is on b, so $s_1$ is more likely to end up at b than at c.

2.1 An Upper Bound for Uniform Spaces 

On average, our algorithm moves the expensive server after paying approximately 
3/2 for the moves of the cheap server. More precisely, it is defined as follows: 

Algorithm RANDOM: If the request is on $s_1$ or $s_2$, do nothing. Otherwise
serve the request by $s_1$ with probability $p = 2\beta/(3 + \beta)$ and by $s_2$ otherwise.

Theorem 1. Algorithm RANDOM for the weighted 2-server problem in uniform
metric spaces is $(5 - \beta)$-competitive against an adaptive online adversary.

Proof. Consider the algorithm described above. Define a potential function $\Phi$:

$$\Phi(s_1, s_2, a_1, a_2) = \begin{cases}
0 & \text{if } a_1 = s_1 \text{ and } a_2 = s_2, \\
5\beta - \beta^2 & \text{if } a_1 = s_1 \text{ and } a_2 \ne s_2, \\
5 - \beta & \text{if } a_1 \ne s_1 \text{ and } a_2 = s_2, \text{ and} \\
5 + 2\beta/(3 - \beta) & \text{if } a_1 \ne s_1 \text{ and } a_2 \ne s_2.
\end{cases}$$

If the adversary moves, it is easy to check that, in each case, the potential
increases by at most $(5 - \beta)$ times the adversary cost. So it remains to prove that
when the adversary requests one of his server positions, the expected change of
$\Phi$ plus the expected cost of the algorithm is at most 0. This is done by case
analysis and straightforward calculation which we omit.
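
The omitted case analysis can be checked mechanically. The following is a minimal sketch, not part of the original proof: it enumerates all configurations of a small uniform space and tests the two conditions above, namely that an adversary move of cost d raises the potential by at most $(5 - \beta)d$, and that a request on an adversary server position gives expected cost plus expected potential change at most 0. The names `phi` and `check_random`, the number of points `n`, the tolerance `tol`, and the optional override `p` are assumptions introduced for illustration; the algorithm's servers are assumed to occupy distinct points.

```python
from itertools import product


def phi(s1, s2, a1, a2, beta):
    """Potential from the proof of Theorem 1 (depends only on coincidences)."""
    if a1 == s1 and a2 == s2:
        return 0.0
    if a1 == s1:
        return 5 * beta - beta ** 2
    if a2 == s2:
        return 5 - beta
    return 5 + 2 * beta / (3 - beta)


def check_random(beta, n=4, p=None, tol=1e-9):
    """Return True if both potential conditions hold on a uniform n-point space."""
    ratio = 5 - beta
    if p is None:
        p = 2 * beta / (3 + beta)  # probability used by Algorithm RANDOM
    points = range(n)
    for s1, s2, a1, a2 in product(points, repeat=4):
        if s1 == s2:
            continue
        old = phi(s1, s2, a1, a2, beta)
        # Adversary moves: expensive server costs 1, cheap server costs beta.
        for r in points:
            if r != a1 and phi(s1, s2, r, a2, beta) - old > ratio + tol:
                return False
            if r != a2 and phi(s1, s2, a1, r, beta) - old > ratio * beta + tol:
                return False
        # Algorithm moves: request on an adversary position not already covered.
        for r in {a1, a2}:
            if r in (s1, s2):
                continue
            by_expensive = 1 + phi(r, s2, a1, a2, beta) - old
            by_cheap = beta + phi(s1, r, a1, a2, beta) - old
            if p * by_expensive + (1 - p) * by_cheap > tol:
                return False
    return True
```

Calling `check_random(0.5)` exercises every case of the potential; passing a different `p` shows how the check fails when the probability is chosen badly.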



2.2 A Lower Bound for Uniform Spaces 

Theorem 2. For any memoryless randomized algorithm for the weighted 2- 
server problem in a uniform space of three points, the competitive ratio against 
an oblivious adversary is at least 5. 

Proof. Let a, b, and c denote the three points. The algorithm has only one
parameter, which is the probability p that a request unoccupied by a server is
served by $s_1$. At the beginning, the expensive server is at a and the cheap one
at b. We assume that $\beta \to 0$, and all O-notation is relative to this. We prove a
lower bound of $5 - o(1)$; the bound of 5 follows by taking $\beta$ sufficiently small.

First consider what happens if we repeat requests b and c infinitely long.
Eventually, the algorithm moves $s_1$ to b or c and $s_2$ to the other of these two
points. The expected cost of the algorithm is $1 + \beta/p$, since it will take on average
$1/p$ moves of the cheap server. We choose k large enough so that after the
sequence $(cb)^k$, the probability that $s_1$ does not move is $o(1)$ and the expected
cost of the algorithm is $1 + \beta/p - o(1)$. At the end, the probability that $s_1 = b$
is at most 1/2, since the first (non-trivial) request was to the point c.

Consider a sequence of requests $((cb)^k (ab)^k)^l$ for $l = \omega(1)$. Let us call a
subsequence of requests $(cb)^k$ or $(ab)^k$ a phase. Until the algorithm moves $s_1$ to
b, it pays $1 + \beta/p - o(1)$ for each phase. It takes on average at least $2 - o(1)$
phases to move $s_1$ to b, so the total cost of the algorithm is $2(1 + \beta/p)(1 - o(1))$.

The adversary strategy depends on the probability p. If $p \le 2\beta/3$ then the
request sequence is the above sequence $((cb)^k (ab)^k)^l$ for some l satisfying
$l = o(1/\beta)$ and $l = \omega(1)$. The adversary serves the sequence by moving $a_1$ to b and
then moving $a_2$ between a and c $2l$ times, with total cost $1 + 2l\beta = 1 + o(1)$.
The cost of the algorithm is $2(1 + \beta/p)(1 - o(1)) \ge 5 - o(1)$.

If $p > 2\beta/3$ then the request sequence is $(cb)^m ((ca)^l (ba)^l)^*$, where m and l
are chosen such that $m = o(1/\beta) = o(1/p)$, $l = o(m)$, $l = \omega(1)$. The adversary
does not move $a_1$ and serves the requests at b and c with $a_2$, at cost
$2(l + m)\beta = 2m\beta(1 + o(1))$. The probability that $s_1$ does not move during the first 2m steps
is $q = (1 - p)^{2m}$. The expected cost of the cheap server during the first 2m steps
is $\beta/p - q \cdot \beta/p$ (the cost would be $\beta/p$ for an infinite sequence; with probability
q we stop after 2m steps and save $\beta/p$, which is the expected cost starting after
2m steps conditioned on the fact that we have at least 2m steps). If $s_1$ moves,
the additional cost is $1 + 2(1 + \beta/p)(1 - o(1))$. So the total expected cost of the
algorithm is at least $(1 - q) \cdot 3(1 + \beta/p)(1 - o(1))$. Elementary calculations show
that $1 - q = 2mp(1 - o(1))$. Thus the competitive ratio is at least

$$\frac{2mp \cdot 3(1 + \beta/p)(1 - o(1))}{2m\beta(1 + o(1))} = 3\left(\frac{p}{\beta} + 1\right)(1 - o(1)) \ge 5 - o(1).$$



2.3 A Lower Bound for the Line 

Theorem 3. There exists no competitive randomized memoryless algorithm for 
the weighted 2-server problem on the real line. 

Proof. Let i? > 1 be an arbitrarily large integer and (i = 2“^^. At the beginning, 
assume that S 2 and 02 are at 0 and si and oi are at 1. We have AR phases, phases. 
In a phase i, we alternate requests to points 2* and 0, the total of 2®-^ requests. 
The optimal algorithm moves oi to 0 and serves all other requests with 02 . The 
total cost is 1 -I- 2'^^ (3 = 2. 

Let Ci be expected cost of the algorithm in phases i -I- 1, . . . , 4i?, assuming 
that after i phases S 2 is at 0 and si is at 2L We prove by backwards induction 
that Ci > (4i? — i)2®“^. By definition. Cm = 0. For the induction step, we 
analyze phase i, assuming that S 2 starts at 0 and si at 2®“^. If si stays at 2®“^ 
then the cost is at least 2®^ • 2®/3 > 2^^ > (4i? — f)2®“^ in phase i alone. Assume 
now that si moves. If si moves to 2®, then the cost of this and all following 
phases is at least 2®“^ -I- Q+i. If si moves to 0 then the cost of this phase is at 
least 2®“^. Since si is more likely to move to 2® than to 0, the expected total 
cost is at least Ci > 2®“^ -I- Ci+i/2 > (4i? — i)2®“^. Thus the algorithm pays at 
least Co > 2R, and the competitive ratio is at least R. 



3 Deterministic Algorithms 

Now we focus on deterministic algorithms. We start by introducing work func- 
tions. Then, we give a 5-competitive algorithm for the weighted 2-server problem 
in uniform spaces and prove that the ratio 5 is optimal. In the last subsection 
we study a simplified version of the weighted 2-server problem; for this version 
we give a 9-competitive algorithm for an arbitrary metric space. 

Work Functions. A configuration is a pair (x, y), where x, y are the locations
of the expensive and cheap servers, respectively. By $\omega_\varrho(x, y)$ we denote the work
function on (x, y), defined as the minimum cost of serving the request sequence
$\varrho$ and ending in configuration (x, y). We can compute $\omega$ by dynamic programming
as follows. Initially, $\omega_\epsilon(x, y) = x_0x + \beta \cdot y_0y$, where $(x_0, y_0)$ is the initial
configuration. Let $\varrho = \sigma r$. If $r \in \{x, y\}$ then $\omega_\varrho(x, y) = \omega_\sigma(x, y)$. Otherwise,
$\omega_\varrho(x, y) = \min\{\omega_\sigma(r, y) + xr,\ \omega_\sigma(x, r) + \beta \cdot yr\}$. Function $\omega_\varrho$ satisfies the
Lipschitz condition: for any points $x, y, u, v \in M$, $\omega_\varrho(u, v) \le \omega_\varrho(x, y) + xu + \beta \cdot yv$.

We use modified work functions with only one parameter, which is the position
of the expensive server, defined by $\omega_\varrho(x) = \min_y \omega_\varrho(x, y)$. Intuitively, as $\beta$
approaches 0, the position of the cheap server becomes less significant.
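
The dynamic program above translates directly into code. The following is a minimal sketch under stated assumptions: the metric space is a finite list `points` with a distance function `dist`, and the names `work_function`, `modified_work_function` and `uniform` are introduced here for illustration only.

```python
from itertools import product


def work_function(points, dist, beta, start, requests):
    """Work function after serving `requests`, as a dict {(x, y): cost}.

    x is the position of the expensive server (weight 1) and y the position
    of the cheap server (weight beta); `start` is the initial configuration.
    """
    x0, y0 = start
    w = {(x, y): dist(x0, x) + beta * dist(y0, y)
         for x, y in product(points, repeat=2)}
    for r in requests:
        prev = w
        w = {}
        for (x, y), value in prev.items():
            if r in (x, y):
                # The request is already covered: the value is unchanged.
                w[(x, y)] = value
            else:
                # Serve r with one server, then move that server back.
                w[(x, y)] = min(prev[(r, y)] + dist(x, r),
                                prev[(x, r)] + beta * dist(y, r))
    return w


def modified_work_function(w, points):
    """One-parameter work function: minimize over the cheap server's position."""
    return {x: min(w[(x, y)] for y in points) for x in points}


def uniform(u, v):
    """Uniform metric: distinct points are at distance 1."""
    return 0 if u == v else 1
```

For example, on the uniform space `["a", "b", "c"]` with `beta = 0.5`, initial configuration `("a", "b")` and the single request `"c"`, serving the request with the cheap server costs 0.5, so the modified work function at `"a"` is 0.5.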

For the algorithms based on work functions we use the potential functions
as follows. The potential is bounded from below by $C - R \min_{(x,y)} \omega(x, y)$, for
some constant C. We show that $\Delta cost + \Phi' \le \Phi$, where $\Delta cost$ is the cost of
the algorithm (for one request or one phase), and $\Phi$ and $\Phi'$ are the old and
the new potential, respectively. Summing over all requests, the algorithm's cost
is at most the initial constant minus the final potential. This is at most
$C + R \min_{(x,y)} \omega(x, y)$, and R-competitiveness follows.

3.1 The Work-Function Algorithm for Uniform Spaces 

For simplicity, we assume that $\beta < 1$ and $1/\beta$ is an integer. This is not a major
restriction, since we are interested in the asymptotic behavior for $\beta \to 0$. Further,
we assume that the adversary does not make any requests on $s_1$ and $s_2$; otherwise
the algorithm can simply ignore the request.

The Work Function Algorithm (WFA) minimizes the current cost plus the
optimal cost of the new configuration. Adapted to our case, it works as follows.

Algorithm MWFA: Let u be the request to be served, let $\omega$ be the work
function for the request sequence ending at u. Suppose the current configuration
is (x, y). If $\omega(x) = \omega(u) + 1$, move $s_1$ from x, otherwise move $s_2$ from y.

Let us now examine $\omega_\varrho(x)$ in more detail. Suppose that $x \ne r$, $\varrho = \sigma r$ and
let q be the last request in $\sigma$, $q \ne r$. Function $\omega$ satisfies the following Lipschitz
condition: for all points x, y,

$$\omega_\varrho(x) \le \omega_\varrho(y) + 1, \qquad (1)$$

because if $\omega_\varrho(y) = \omega_\varrho(y, z)$, then $\omega_\varrho(x) \le \omega_\varrho(x, z) \le \omega_\varrho(y, z) + 1 = \omega_\varrho(y) + 1$.

For $y \ne r, q$, we have $\omega_\varrho(x, y) = \min\{\omega_\sigma(r, y) + 1,\ \omega_\sigma(x, r) + \beta\} \ge \omega_\sigma(x, r) = \omega_\varrho(x, r)$. Therefore,

$$\omega_\varrho(x) = \min\{\omega_\varrho(x, r),\ \omega_\varrho(x, q)\} = \min\{\omega_\varrho(x, r),\ \omega_\varrho(r, q) + 1\} \qquad (2)$$

We have $\omega_\varrho(x) \le \omega_\varrho(x, r) = \omega_\sigma(x, r) \le \omega_\sigma(x) + \beta$. Due to integrality of $1/\beta$,
either $\omega_\varrho(x) = \omega_\sigma(x)$ or $\omega_\varrho(x) = \omega_\sigma(x) + \beta$. Note that this also implies that each
phase has at least $1/\beta$ requests. The next lemma shows that in typical cases
$\omega(x)$ increases.

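Lemma 1. Let $\varrho = \sigma r$ and let q be the last request in $\sigma$, $q \ne r$.

(a) If $x \notin \{q, r\}$ and $\omega_\varrho(x) < \min\{\omega_\varrho(r), \omega_\varrho(q)\} + 1$ then $\omega_\varrho(x) = \omega_\sigma(x) + \beta$.

(b) For $\beta < 1$, if $s \notin \{q, r\}$ and $\omega_\varrho(r) = \min(\omega_\varrho)$, then $\omega_{\varrho s}(r) = \omega_\varrho(r) + \beta$.

Proof. (a) We have $\omega_\varrho(x) = \omega_\varrho(x, r)$, for otherwise we would get
$\omega_\varrho(x) = \omega_\varrho(r, q) + 1 \ge \omega_\varrho(r) + 1$. Thus $\omega_\varrho(x) = \omega_\sigma(x, r) = \min\{\omega_\sigma(x, q) + \beta,\ \omega_\sigma(q, r) + 1\}$.
Since $\omega_\varrho(x) < \omega_\varrho(q) + 1 \le \omega_\varrho(q, r) + 1 = \omega_\sigma(q, r) + 1$, we get
$\omega_\varrho(x) = \omega_\sigma(x, q) + \beta \ge \omega_\sigma(x) + \beta$.

(b) We have $\omega_{\varrho s}(r) = \min\{\omega_\varrho(r, s),\ \omega_\varrho(s, r) + 1\} = \omega_\varrho(r, s) = \min\{\omega_\sigma(r, q) + \beta,\ \omega_\sigma(q, s) + 1\}$
$\ge \min\{\omega_\varrho(r, q) + \beta,\ \omega_\varrho(q, s) + 1 - \beta\} \ge \min\{\omega_\varrho(r) + \beta,\ \omega_\varrho(q) + 1 - \beta\}$
$\ge \min(\omega_\varrho) + \beta = \omega_\varrho(r) + \beta$.

Theorem 4. For $\beta < 1$, $1/\beta$ integer, MWFA is a 5-competitive algorithm for
the weighted 2-server problem in uniform spaces.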

Proof. We divide the computation into phases, with each phase ending when
$s_1$ moves. We define the potential function for configurations at the beginning
of each phase. Let $\omega$ be the work function at the beginning of the phase and
let r be the last request; thus $s_1$ is at r. If MWFA moved $s_1$ to r from z then
$\omega(z) = \omega(r) + 1$. Thus, by (1), $\omega(\cdot)$ is minimized at r. Let a and b be the next
two minima of $\omega$, that is, $a \ne r$ and $\omega(a) = \min_{x \ne r} \omega(x)$, and $b \notin \{r, a\}$ and
$\omega(b) = \min_{x \notin \{r, a\}} \omega(x)$. We define the potential as

$$\Phi = \max\{-\omega(r) - 4\omega(a),\ 2 - \omega(r) - 4\omega(b)\}$$

Consider one phase, in which the work function changes from $\omega$ to $\mu$, and let
s be the last request in this phase. Thus $\mu(r) = \mu(s) + 1$, and s is the minimum
of $\mu$. Let c and d be the next two minima of $\mu$.

We need to show that $\Delta cost + \Phi' \le \Phi$, where $\Delta cost$ is the total cost of the
algorithm during the phase, and $\Phi$ and $\Phi'$ are the old and the new potentials.

Without loss of generality, we assume that $\omega(r) = 0$ (since we can uniformly
decrease $\omega$ and $\mu$ by $\omega(r)$). Lemma 1 implies that during the phase the work
function on r increases by $\beta$ on each request except the last. (Lemma 1(b) applies
to the first request in a phase, and Lemma 1(a) to all intermediate requests.)
Since MWFA pays $\beta$ for each such request, $\Delta cost \le \mu(r) - \omega(r) + 1 = 2 + \mu(s)$.
Since $s \ne r$, we have $\mu(s) \ge \omega(s) \ge \omega(a)$. We distinguish two cases.

Case 1: $\Phi' = -\mu(s) - 4\mu(c)$. If $c = r$ then $\mu(c) = \mu(s) + 1$ and

$$\Delta cost + \Phi' \le [2 + \mu(s)] + [-4 - 5\mu(s)] \le -4\mu(s) \le -4\omega(a) \le \Phi$$

If $c \ne r$, then $\mu(c) \ge \omega(b)$, since $\mu(c) \ge \omega(c)$ and $\mu(c) \ge \mu(s) \ge \omega(s)$. Thus

$$\Delta cost + \Phi' \le [2 + \mu(s)] + [-\mu(s) - 4\mu(c)] = 2 - 4\mu(c) \le 2 - 4\omega(b) \le \Phi$$

Case 2: $\Phi' = 2 - \mu(s) - 4\mu(d)$. If $r \in \{c, d\}$, then $\mu(d) \ge \mu(r) = \mu(s) + 1$, so

$$\Delta cost + \Phi' \le [2 + \mu(s)] + [-2 - 5\mu(s)] = -4\mu(s) \le -4\omega(a) \le \Phi$$

Otherwise r, s, c, and d are all distinct. We claim that

$$\mu(d) \ge \min\{\omega(a) + 1,\ \omega(b) + \tfrac{1}{2}\}.$$

This implies that

$$\Delta cost + \Phi' \le [2 + \mu(s)] + [2 - \mu(s) - 4\mu(d)] = 4 - 4\mu(d)$$
$$\le 4 - 4\min\{\omega(a) + 1,\ \omega(b) + \tfrac{1}{2}\} = \max\{-4\omega(a),\ 2 - 4\omega(b)\} = \Phi$$

To prove the claim, suppose that $\mu(d) < \omega(a) + 1$. We prove by induction
that for any work function $\xi$ during the phase

$$\xi(s) + \xi(c) + \xi(d) \ge \xi(r) + 2\omega(b) \qquad (3)$$

Initially, $\xi = \omega$ and $\omega(r) = 0$. Since at least two of $\omega(c)$, $\omega(d)$, $\omega(s)$ are at least
$\omega(b)$, the inequality holds.

Consider one step in the phase. Suppose first that some $t \in \{s, c, d\}$ is maxed-out
after the request, that is $\xi(t) = 1 + \min_x \xi(x)$. Denote the other two points by
y and z, so that $\xi(y) \le \xi(z)$. Since $\mu(t) \le \mu(d) < \omega(a) + 1$, we have $\xi(t) = \xi(r) + 1$.
But then $\xi(t) + \xi(y) + \xi(z) = \xi(r) + [\xi(y) + 1] + \xi(z) \ge \xi(r) + 2\omega(b)$.

If the new work function is not maxed out on any of s, c or d, Lemma 1(a)
implies that the work function must increase on at least one of s, c, d. The
right-hand side can increase at most by $\beta$, so inequality (3) is preserved.

At the end of the phase we have $\xi = \mu$ and $2\mu(d) \ge \mu(c) + \mu(d) \ge \mu(r) - \mu(s) + 2\omega(b) = 1 + 2\omega(b)$.
So we showed that $\mu(d) < \omega(a) + 1$ implies $\mu(d) \ge \omega(b) + \tfrac{1}{2}$.

The proof for a complete phase is now finished. The last phase may be
incomplete. Then the cost of the algorithm is at most the increase of the work
function, and the theorem follows.

3.2 A Lower Bound for Uniform Spaces 

We use a space with three points a, b, c. The expensive server starts at a and the
cheap one at b. The adversary always requests the point not occupied by $s_1$, $s_2$.
We again divide the request sequence into phases, where a phase ends when the
algorithm moves $s_1$.

Let $t_i$ be the number of requests served by $s_2$ in phase i. Thus the algorithm
pays $t_i\beta + 1$ in this phase. Let $\omega_i(x)$ be the modified work function (as defined
earlier) after i phases.

Now we define $\varphi_i = 2[\omega_i(a) + \omega_i(b) + \omega_i(c)] - \min\{\omega_i(a), \omega_i(b), \omega_i(c)\}$. This
function is approximately five times the optimal cost. We need to prove that it
is a lower bound on the cost of the algorithm.

Lemma 2. For every phase i, we have $\varphi_i - \varphi_{i-1} \le t_i\beta + 1 + 6\beta$.

Proof. Let u be the position of $s_1$ at the beginning of the phase. Let
$U = \omega_{i-1}(u)$, and $V = \omega_{i-1}(v)$, where $v \in \{a, b, c\} - \{u\}$ is the point with the smaller
value of $\omega_{i-1}$. During the phase, the adversary alternates the requests to the two
points other than u. From the definition of the work function it is easy to see that
$\omega_i(u) \le \min\{V + 1 + \beta,\ U + (t_i + 1)\beta\}$ and, for $x \ne u$, $\omega_i(x) \le \omega_{i-1}(x) + \beta$. We
distinguish three cases.

Case 1: $U + t_i\beta \le V$. Then $\omega_i(u) \le U + (t_i + 1)\beta$ and u achieves the minimum of
both $\omega_{i-1}$ and $\omega_i$ (up to an additive term of $\beta$), so
$\varphi_i - \varphi_{i-1} \le (U + t_i\beta + 6\beta) - U = t_i\beta + 6\beta \le t_i\beta + 1 + 6\beta$.

Case 2: $V \le U$. Then V is the minimum of $\omega_{i-1}$ and an approximate minimum of
$\omega_i$, so $\varphi_i - \varphi_{i-1} \le (2\omega_i(u) + 4\beta) - 2U \le ((U + t_i\beta) + (V + 1) + 6\beta) - 2U \le t_i\beta + 1 + 6\beta$.

Case 3: $U < V < U + t_i\beta$. Now U is the minimum of $\omega_{i-1}$ and V is the
approximate minimum of $\omega_i$. Thus $\varphi_i - \varphi_{i-1} \le (2\omega_i(u) + V + 4\beta) - (2V + U) \le$
$((U + t_i\beta) + (V + 1) + V + 6\beta) - (2V + U) = t_i\beta + 1 + 6\beta$.

Theorem 5. For any deterministic algorithm for the weighted 2-server problem 
in a uniform space with three points, the competitive ratio is at least 5. 

Proof. Let k be the number of phases, C the cost of the algorithm, and Copt 
the optimal cost. By summing over all phases. Lemma El implies that C 6kp > 
<pk — <Po ~ 6fc/3 > bCopt — 4. Since the algorithm pays at least 1 in each phase, 
we have $C \ge k$, and the theorem follows by taking $\beta$ sufficiently small.



3.3 The Weighted 2-Point Request Problem 

In this section we study the modification of the weighted 2-server problem in 
which each request is specified by two points, say {r, s}. In response to this 
request, the algorithm must move one server to r and the other to s. The decision 
to be made is which server to move to which point. 

This can be viewed as a subproblem of the weighted 2-server problem: Replace
the 2-point request {r, s} by a long sequence (rs)*. Any competitive 2-server
algorithm eventually moves its servers to r and s. In this way any R-competitive
weighted 2-server algorithm yields an R-competitive algorithm for
the weighted 2-point request problem.

On the other hand, in the limit for $\beta \to 0$, we obtain the 2-point request
problem with one server studied in iia, which in turn contains the closely related
cow-path problem EE31-. This yields a lower bound of 9. We prove a matching
upper bound of 9 for the algorithm WFA$_3$ which at each step minimizes the
cost of the move plus three times the optimal cost of the new configuration.

Algorithm WFA$_3$: Let (x, y) be the current configuration, let {r, s} be the new
request and $\omega'$ the new work function. If $xr + \beta \cdot ys + 3\omega'(r, s) \le xs + \beta \cdot yr + 3\omega'(s, r)$,
then move to (r, s), otherwise move to (s, r).
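
One step of this rule can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function name `wfa3_step`, its argument layout, and the representation of the work function as a pair (value with the expensive server at the first point, value with it at the second point) are assumptions made here. The update of the two work function values follows the formulas stated in the proof of Theorem 6.

```python
def wfa3_step(dist, beta, config, work, request):
    """Serve one 2-point request with the WFA_3 decision rule.

    config  -- (x, y): positions of the expensive and the cheap server
    work    -- (wx, wy): work function with the expensive server at x, at y
    request -- (r, s): the two points that must both be covered
    Returns the new configuration, the cost paid, and the new work pair.
    """
    x, y = config
    r, s = request
    wx, wy = work
    a, b = dist(x, r), dist(x, s)
    f, g = dist(y, r), dist(y, s)
    # New work function values with the expensive server at r or at s.
    w_r = min(wx + a + beta * g, wy + f + beta * b)
    w_s = min(wx + b + beta * f, wy + g + beta * a)
    # Cost of the move plus three times the optimal cost of the new configuration.
    if a + beta * g + 3 * w_r <= b + beta * f + 3 * w_s:
        return (r, s), a + beta * g, (w_r, w_s)
    return (s, r), b + beta * f, (w_s, w_r)
```

For instance, on the line with `beta = 0.5`, servers at `(0, 10)` and initial work pair `(0.0, 15.0)` (15 being the cost of swapping the two servers), the request `(1, 9)` keeps the expensive server on the left at cost 1.5.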

Theorem 6. WFA$_3$ is R-competitive for the weighted 2-point request problem
in any metric space, where $R = (9 - 3\beta)/(1 + \beta)$.

Proof. Let (x, y) be the current configuration and let $\omega_x$ and $\omega_y$ be the work
function values when the expensive server is at x and y, respectively. Let (r, s), $\omega'_r$
and $\omega'_s$ be the new configuration and work function values, after serving the
request {r, s}. Denote the distances as follows: d = xy, e = rs, a = xr, b = xs,
f = yr, and g = ys. The cost of the algorithm for serving the new request is
$\Delta cost = a + \beta g$ and the new work function is $\omega'_r = \min\{\omega_x + a + \beta g,\ \omega_y + f + \beta b\}$,
$\omega'_s = \min\{\omega_x + b + \beta f,\ \omega_y + g + \beta a\}$.

From the definition, WFA$_3$ satisfies the invariant $3(\omega_x - \omega_y) \le (1 + \beta)d$. Since
in the current step WFA$_3$ moved to (r, s) and not (s, r), we have
$a + \beta g + 3\omega'_r \le b + \beta f + 3\omega'_s$. We use the following potential function:

$$\Phi = \left[2d + \frac{6}{1 + \beta}(\omega_x - \omega_y)\right]^+ - R\omega_x$$

where $[a]^+ = \max(a, 0)$. We need to show that $\Delta cost + \Delta\Phi = a + \beta g + \Phi' - \Phi \le 0$.
We distinguish several cases according to the possible values of $\Phi'$, $\omega'_r$, and $\omega'_s$.
In each case, the last step uses a triangle inequality.

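Case 1: $(1 + \beta)e + 3(\omega'_r - \omega'_s) \le 0$. Then $\Phi' = -R\omega'_r$.

Case 1.1: $\omega'_r = \omega_x + a + \beta g$. Then

$$\Delta cost + \Delta\Phi \le a + \beta g - R(\omega_x + a + \beta g) + R\omega_x \le 0$$

Case 1.2: $\omega'_r = \omega_y + f + \beta b$. Then

$$\begin{aligned}
\Delta cost + \Delta\Phi &\le a + \beta g - R(\omega_y + f + \beta b) - \left[2d + \frac{6}{1 + \beta}(\omega_x - \omega_y)\right] + R\omega_x \\
&= a + \beta g - Rf - R\beta b - 2d + \frac{3 - 3\beta}{1 + \beta}(\omega_x - \omega_y) \\
&\le a + \beta g - f - \beta b - 2d + (1 - \beta)d \\
&= (a - d - f) + \beta(g - d - b) \le 0
\end{aligned}$$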



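Case 2: $(1 + \beta)e + 3(\omega'_r - \omega'_s) > 0$.

Case 2.1: $\omega'_r = \omega_x + a + \beta g$ and $\omega'_s = \omega_x + b + \beta f$. Then $a + \beta g \le b + \beta f$ and

$$\begin{aligned}
\Delta cost + \Delta\Phi &\le a + \beta g + 2e + \frac{6}{1 + \beta}(a + \beta g - b - \beta f) - R(\omega_x + a + \beta g) + R\omega_x \\
&= \frac{2}{1 + \beta}\left[(e - b - a) + \beta(e - f - g) + 2\beta(a + \beta g) - 2(b + \beta f)\right] \le 0
\end{aligned}$$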

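Case 2.2: $\omega'_r = \omega_x + a + \beta g$ and $\omega'_s = \omega_y + g + \beta a$. Then

$$\begin{aligned}
\Delta cost + \Delta\Phi &\le a + \beta g - \left[2d + \frac{6}{1 + \beta}(\omega_x - \omega_y)\right] + R\omega_x \\
&\quad + 2e + \frac{6}{1 + \beta}(\omega_x + a + \beta g - \omega_y - g - \beta a) - R(\omega_x + a + \beta g) \\
&= 2(e - d - a - g) - 4(1 - \beta)g \le 0
\end{aligned}$$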



Case 2.3: $\omega'_r = \omega_y + f + \beta b$ and $\omega'_s = \omega_y + g + \beta a$. Then $\Delta cost + \Delta\Phi$ is at most

6 



a + Pg + 



2e T 
2 d + 



l + P"~^ 
6 



1 + /? 

1 — 5/? 6 — P — P'^ 

i+p 

6 - /? - /?2 



f pb — ujy — g — Pa) — R{ujy H- / H- pb) 
(u^a; ^y) RuJq 



^ 3-3/?, , 3-3/?, 

g -\- 2 e — 2 d-\- ^ ^ ^ {ujx — ujy) — ^ ^ / 



1 H“ /? 



1 H“ /? 



< a — 



1 H“ /? 



3 — 3/? 

^ + 2e — (1 + P)d — p if + pb) 



- Pp {e- f - g) + {a- e- d- g) < 0 
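Case 2.4: $\omega'_r = \omega_y + f + \beta b$ and $\omega'_s = \omega_x + b + \beta f$. Then $\Delta cost + \Delta\Phi$ is at most

$$\begin{aligned}
& a + \beta g - \left[2d + \frac{6}{1 + \beta}(\omega_x - \omega_y)\right] + R\omega_x + 2e + \frac{6}{1 + \beta}(\omega'_r - \omega'_s) - R\omega'_r \\
&= a + \beta g + 2e - 2d + 3(\omega'_r - \omega'_s) - 3(2 - \beta)f - 3b \\
&\le a + \beta g + 2e - 2d + (b + \beta f - a - \beta g) - 3(2 - \beta)f - 3b \\
&= 2(e - d - b - f) - 4(1 - \beta)f \le 0
\end{aligned}$$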



4 Final Comments 

We proved mainly results for uniform spaces. Many open problems remain. For 
example, no competitive algorithm for the weighted 2-server problem in arbi- 
trary spaces is known, and the lower bound of 10.12 from m can probably be 
improved. This lower bound also shows that the optimal competitive ratio for 
the weighted 2-point request problem is strictly smaller than for the general
weighted 2-server problem. Technically, the 2-point request problem is simpler 
to analyze, since the work function has only two relevant values. 

The weighted 2-server problem is related to the CNN problem from ^3 . In 
this problem, we have one server in the plane. Each request is a point (x, y), and 
to serve this request we need to move the server to some point with x-coordinate 
x or with y-coordinate y. The special case when the requests are restricted to
some line in the plane is equivalent to a weighted 2-server problem on the line. 

Koutsoupias and Taylor m also prove a lower bound for memoryless ran- 
domized algorithms for the CNN problem. Our lower bound for memoryless 
algorithms is somewhat stronger, in two respects: the problem is a special case 
of the CNN problem and, unlike in 1201 , we do not assume that the algorithm is 
invariant with respect to scaling distances.

Abstract. The k-server problem is one of the most fundamental on-line
problems. The problem is to schedule k mobile servers to serve a
sequence of service points in a metric space to minimize the total mileage.
The k-server conjecture im, which states that there exists an optimal
k-competitive on-line algorithm, has been open for over 10 years. The top
candidate on-line algorithm for settling this conjecture is the Work Function
Algorithm (WFA), which was recently shown I7I9I to have competitive
ratio at most 2k - 1. In this paper we lend support to the conjecture that
WFA is in fact k-competitive by proving that it achieves this ratio in several
special metric spaces.



1 Introduction 

The k-server problem m together with its special case, the paging problem, is
probably the most influential on-line problem. The famous k-server conjecture
has been open for over 10 years. Yet, the problem itself is very easy to state:
There are k servers that can move in a metric space. Their purpose is to service a
sequence of requests. A request is simply a point of the metric space and servicing
it entails moving a server to the requested point. The objective is to minimize
the total distance traveled by all servers. In the on-line version of the problem,
the requests are presented one-by-one. The notorious k-server conjecture states
that there is an on-line algorithm that has competitive ratio k on any metric
space. The top candidate on-line algorithm for settling the k-server conjecture is
the Work Function Algorithm (WFA) which was shown to have competitive
ratio at most 2k - 1.

In this paper, we show three results. The first is that the WFA is k-competitive
in the line. Our second result is that the WFA is k-competitive for the "symmetric
weighted cache" (represented by weighted star instances). It was known m that
the k-server conjecture holds for these instances, but the algorithm employed was
not the WFA, but the Double Coverage algorithm, which has no natural extension
for non-tree-like metric spaces. Our third result is a new proof that the WFA is
k-competitive for metric spaces of k+2 points. This was first shown in ECDI using
an involved potential. Our proof here uses a much simpler potential.

* Supported in part by NSF grant CCR-9521606

There is an interesting connection between the three results of this work.
In all cases, the number of minimizers (to be defined later) is at most k + 1.
Although this fact by itself cannot guarantee that the WFA is k-competitive, it
is at the heart of our proofs.

2 Preliminaries 

We summarize here our notation, conventions and definitions. For a more thorough
discussion that includes the history of the problem see . Let $\rho = r_1 \ldots r_n$
be a request sequence. The work function $w_i(X)$ is defined to be the
optimal cost for servicing $r_1 \ldots r_i$ and moving to configuration X. The Work
Function Algorithm works as follows: Let $A_i$ be its configuration just before
servicing request $r_{i+1}$. To service $r_{i+1}$, it moves to configuration $A_{i+1}$ that contains
$r_{i+1}$ and minimizes $w_{i+1}(A_{i+1}) + d(A_i, A_{i+1})$.
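
The definitions above can be made concrete on a small instance. The following is a minimal brute-force sketch for points on the line, under assumptions that are not in the original: configurations are sorted tuples over a finite set of integer points, the distance between configurations is computed by matching sorted points in order, and the algorithm is restricted to moving a single server per request. The names `config_dist`, `swap_in`, `update` and `run_wfa` are introduced here for illustration.

```python
from itertools import combinations_with_replacement


def config_dist(X, Y):
    """Distance between two configurations on the line (sorted matching)."""
    return sum(abs(x - y) for x, y in zip(sorted(X), sorted(Y)))


def swap_in(X, idx, r):
    """Configuration obtained from X by moving the server X[idx] to r."""
    return tuple(sorted(X[:idx] + X[idx + 1:] + (r,)))


def update(w, r):
    """Work function after one more request r."""
    new = {}
    for X, value in w.items():
        if r in X:
            new[X] = value
        else:
            new[X] = min(w[swap_in(X, i, r)] + abs(x - r)
                         for i, x in enumerate(X))
    return new


def run_wfa(points, start, requests):
    """Run the Work Function Algorithm; return (cost, final config, optimum)."""
    k = len(start)
    A = tuple(sorted(start))
    w = {X: config_dist(A, X)
         for X in combinations_with_replacement(sorted(points), k)}
    cost = 0
    for r in requests:
        w = update(w, r)
        if r not in A:
            # Minimize new work function value plus the distance moved.
            i = min(range(k),
                    key=lambda j: w[swap_in(A, j, r)] + abs(A[j] - r))
            cost += abs(A[i] - r)
            A = swap_in(A, i, r)
    return cost, A, min(w.values())
```

With two servers starting at 0 and 3 on the points 0 to 3, a single request to 1 is served by the server at 0, since that choice minimizes the work function value plus the distance moved.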

Chrobak and Larmore pl introduced the concept of extended cost of the WFA
(which they call pseudocost): The extended cost for request $r_{i+1}$ is equal to the
maximum increase of the work function: $\max_X\{w_{i+1}(X) - w_i(X)\}$. They showed
that the extended cost is equal to the on-line plus the off-line cost (see also P|).
Consequently, to prove that the Work Function Algorithm is c-competitive, it
suffices to bound the total extended cost by $(c + 1)\,\mathrm{OPT}(\rho) + \mathrm{const}$, where $\mathrm{OPT}(\rho)$
is the optimal (off-line) cost to service $\rho$.

For general metric spaces, the best known upper bound on the competitive
ratio for the k-server problem is 2k - 1 l/iuj (see also 0 for a simpler proof),
which improved the previous exponential (in k) bounds [hll j. Unlike the previous
results, the algorithm employed in m to establish the 2k - 1 bound is the
WFA. The proof is based on some fundamental properties (Quasiconvexity and
Duality) of work functions. Here we will make use of the Duality property which
characterizes the configurations that achieve the maximum $\max_X\{w_{i+1}(X) - w_i(X)\}$.



Lemma 1 (Duality lemma | 0 ^). Let X be a configuration that minimizes

$$w_i(X) - \sum_{x \in X} d(r_{i+1}, x).$$

Then X minimizes also

$$w_{i+1}(X) - \sum_{x \in X} d(r_{i+1}, x)$$

and maximizes the extended cost

$$\max_X\{w_{i+1}(X) - w_i(X)\}.$$

A configuration X that minimizes $w_i(X) - \sum_{x \in X} d(p, x)$ will be called a
minimizer of p with respect to $w_i$.

3 The WFA for the Line

In this section, we will show that the WFA is k-competitive in the line. To simplify
the presentation, we assume that all requests are in a fixed interval [a, b]. Let
us denote the configuration that contains m copies of a and k - m copies of b
as $a^m b^{k-m}$. We shall call these configurations extreme. Observe that there are
exactly k + 1 extreme configurations (m = 0, ..., k). The next lemma shows that
we can generally assume that minimizers are extreme configurations.

Lemma 2. Assume that all requests are in the interval [a, b]. For any point
$p \in [a, b]$ and any work function $w_i$, there is $m \in \{0, \ldots, k\}$ such that $a^m b^{k-m}$
is a minimizer of p with respect to $w_i$.

Proof. Clearly, there is a minimizer X of p with respect to $w_i$ with all points
in the interval [a, b]. Assume that there is a point $x \in X$ in the interval [a, p].
What will happen if we slide x to a? The work function $w_i(X)$ can increase by
at most d(a, x) while the distance of x from p will increase by exactly d(a, x).
Therefore X - x + a is also a minimizer of p. More precisely,

$$w_i(X - x + a) - \sum_{y \in X - x + a} d(p, y) \le (w_i(X) + d(a, x)) - \Big(\sum_{y \in X} d(p, y) + d(a, x)\Big) = w_i(X) - \sum_{y \in X} d(p, y)$$

Similarly, we can slide all points of X to either a or b. If X has m points in [a, p],
then $a^m b^{k-m}$ is a minimizer of p. □

Theorem 1. The WFA is k-competitive in the line. 

Proof. We first show the somewhat simpler result that the WFA is ^-competitive 
in an interval [a,b]. The same proof extends to the infinite line. 

We define a potential to be the sum of Wi on all extreme configurations: 

k 

j=0 

We will show that is an upper bound (within a constant) of the extended cost. 
By Lemma|21 there is m such that is a minimizer of with respect 

to Wi. The increase of the potential, <^^+1 — d>i, is equal to the increase of the 
work function on all extreme configurations. Since the work function increases 
monotonically, i.e., Wi+\{X) > Wi(X), the increase — d>i of the potential 
is at least Wi+i(a™b^~"^) — Wi(a™b^ "*), which is the extended cost to service 
Ti+i. It follows, by telescoping, that the total extended cost, i.e., the sum of the 
extended cost for all requests, is bounded from above by Fn — <Po. 

For a fixed interval $[a,b]$, the values of a work function cannot differ too
much: for any work function $w$ and any configurations $X$ and $Y$:
$w(X) - w(Y) \le d(X,Y) \le k\,d(a,b)$. This allows us to conclude that $\Phi_n$
is equal (within a constant) to $(k+1)\,\mathrm{OPT}(\rho_n) = (k+1)\min_X\{w_n(X)\}$
and that $\Phi_0$ is constant. The total extended cost is therefore bounded above
by $(k+1)\,\mathrm{OPT}(\rho_n) + \mathrm{const}$, which implies the
$k$-competitiveness of WFA.
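The bookkeeping in this argument can be checked on a small instance. The sketch below is not from the paper: it assumes the standard work function recursion $w_{i+1}(X) = \min_{x \in X}\{w_i(X - x + r_{i+1}) + d(r_{i+1}, x)\}$ with $w_0(X) = d(A_0, X)$, restricts the metric space to a few integer points on a line, and uses hypothetical helper names throughout. It accumulates the extended cost $\max_X\{w_{i+1}(X) - w_i(X)\}$ of each request and compares the total with the increase of the potential over the extreme configurations.

```python
from itertools import combinations_with_replacement

def match_dist(X, Y):
    """Min-cost matching distance between two equal-size multisets on a line."""
    return sum(abs(x - y) for x, y in zip(sorted(X), sorted(Y)))

def initial_work_function(points, start):
    """w_0(X) = d(A_0, X) for every k-point configuration X (a sorted tuple)."""
    k = len(start)
    return {X: match_dist(X, start)
            for X in combinations_with_replacement(sorted(points), k)}

def update(w, r):
    """One step of the assumed recursion: w'(X) = min_x w(X - x + r) + d(r, x)."""
    new_w = {}
    for X in w:
        candidates = []
        for idx, x in enumerate(X):
            Y = tuple(sorted(X[:idx] + (r,) + X[idx + 1:]))
            candidates.append(w[Y] + abs(r - x))
        new_w[X] = min(candidates)
    return new_w

def potential(w, a, b, k):
    """Sum of w over the extreme configurations a^j b^(k-j), j = 0..k."""
    return sum(w[(a,) * j + (b,) * (k - j)] for j in range(k + 1))

def run(points, start, requests):
    """Return (total extended cost, potential increase) for a request sequence."""
    a, b, k = min(points), max(points), len(start)
    w = initial_work_function(points, start)
    phi0, ext = potential(w, a, b, k), 0
    for r in requests:
        new_w = update(w, r)
        ext += max(new_w[X] - w[X] for X in w)  # extended cost of this request
        w = new_w
    return ext, potential(w, a, b, k) - phi0
```

For two servers starting at the endpoints of {0, ..., 4} and a single request at 2, both quantities equal 4, so the bound is tight on that step.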

We now turn to the infinite line. We have to be more careful in this case
and actually compute the constants ignored in the previous paragraph. Let us
first observe that we can again assume that all requests are in an interval
$[a,b]$ where $a$ is the leftmost request of $\rho$ and $b$ is the rightmost one.
The difference from the case of a fixed interval $[a,b]$ is that now we cannot
assume that $d(a,b)$ is constant. Thus we have to show that the additive term
depends only on the initial configuration $A_0$ and is independent of $d(a,b)$.
We can easily compute the initial potential
$\Phi_0 = \sum_{j=0}^{k} w_0(a^j b^{k-j}) = \sum_{j=0}^{k} d(a^j b^{k-j}, A_0)$.
It is easy to see that the last expression is equal to
$\frac{k(k+1)}{2}\,d(a,b) - |A_0|$, where $|A_0|$ is the sum of the distances
between all pairs of points in $A_0$:
$|A_0| = \frac{1}{2}\sum_{a_1, a_2 \in A_0} d(a_1, a_2)$.
Similarly, if $A_n$ is the final configuration of the optimal off-line algorithm,
then $\Phi_n \le (k+1)\,w_n(A_n) + \frac{k(k+1)}{2}\,d(a,b) - |A_n|$. It follows
that the extended cost is bounded above by
$\Phi_n - \Phi_0 \le (k+1)\,w_n(A_n) - |A_n| + |A_0| \le (k+1)\,w_n(A_n) + |A_0|$,
which shows that the total extended cost is bounded above by
$(k+1)\,\mathrm{OPT}(\rho) + \mathrm{const}$ and the proof is complete. □
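The closed form for the initial potential is easy to sanity-check numerically. The sketch below is an illustration, not part of the paper: it assumes that the initial configuration lies inside $[a,b]$, that the distance between two configurations on a line is the sorted-order matching cost, and that the claimed identity is $\sum_{j=0}^{k} d(a^j b^{k-j}, A_0) = \frac{k(k+1)}{2} d(a,b) - |A_0|$ with $|A_0|$ the sum over unordered pairs. All function names are hypothetical.

```python
from itertools import combinations

def match_dist(X, Y):
    """Min-cost matching distance between two equal-size multisets on a line."""
    return sum(abs(x - y) for x, y in zip(sorted(X), sorted(Y)))

def initial_potential(a, b, config):
    """Sum over j of d(a^j b^(k-j), A_0)."""
    k = len(config)
    return sum(match_dist([a] * j + [b] * (k - j), config) for j in range(k + 1))

def pairwise_sum(config):
    """|A_0|: sum of distances over all unordered pairs of points."""
    return sum(abs(x - y) for x, y in combinations(config, 2))

def closed_form(a, b, config):
    """k(k+1)/2 * d(a, b) - |A_0|, valid when config lies inside [a, b]."""
    k = len(config)
    return k * (k + 1) // 2 * (b - a) - pairwise_sum(config)
```

The term that grows with $d(a,b)$ is the same for every configuration in $[a,b]$, which is why it cancels in $\Phi_n - \Phi_0$.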

4 The WFA for Weighted Cache 

It is well known that the problem of accessing pages in a weighted cache can
be modeled by the $k$-server problem on weighted star instances (trees of depth
1). The leaves of the star represent pages and the leaves where servers reside
correspond to the pages in the cache. The weight on the edge from the leaf to the
center is half of the cost for fetching the corresponding page into the cache
(since a server passing through that leaf traverses the edge twice and so pays
this weight twice). The center of the star is denoted $c$.
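A minimal sketch of this modeling step, assuming pages are identified by hypothetical string names and fetch costs are given as a dictionary (neither is specified in the text):

```python
def edge_weights(fetch_cost):
    """Edge weight of each leaf is half of the page's fetch cost."""
    return {page: cost / 2 for page, cost in fetch_cost.items()}

def star_distance(weights, u, v):
    """Distance between leaves u and v through the center c."""
    return 0 if u == v else weights[u] + weights[v]
```

A server that enters leaf `b` and later leaves it traverses that edge twice, so over the two moves it pays the full fetch cost of `b`.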

We show that the WFA is $k$-competitive on such instances.

Recall that a minimizer of $x$ is a configuration $A$ that minimizes
$m_i(A,x) = w_i(A) - \sum_{a \in A} d(a,x)$. It is easy to see that there is
always a minimizer that does not include $x$. Define $\mu_i(A,x)$ as follows: if
$x \notin A$ then $\mu_i(A,x) = w_i(A) - \sum_{a \in A} d(a,c) - d(c,x)$;
otherwise, if $x \in A$, let $\mu_i(A,x) = w_i(A) - \sum_{a \in A - x} d(a,c)$.

Since $\mu_i(A,x) = m_i(A,x) + (k-1)\,d(c,x)$, we have that a configuration $A$ is
a minimizer if and only if it minimizes $\mu_i(A,x)$.

Let the configuration of an adversary be $U_i = \{u_1, \ldots, u_k\}$. We define:

$$\Phi(U_i, w_i) = \sum_{l=1}^{k} \min_A \mu_i(A, u_l).$$

Let the next request be $r_{i+1}$ and assume that the adversary moves the server
from $u_j$ to the request. The new adversary configuration is
$U_i - u_j + r_{i+1}$. The next lemma bounds the change in $\Phi$.
Lemma 3. For any configuration $U_i$, any $u_j \in U_i$, and any $r_{i+1}$:

$$\Phi(U_i - u_j + r_{i+1}, w_i) - \Phi(U_i, w_i) \ge -d(u_j, r_{i+1}).$$

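Proof. Let $A$ be an arbitrary configuration that does not contain $r_{i+1}$. We
first show that there exists a configuration $A'$ such that
$\mu_i(A, r_{i+1}) \ge \mu_i(A', u_j) - d(u_j, r_{i+1})$.

If $u_j \notin A$ then let $A' = A$. We have

$$\begin{aligned}
\mu_i(A, r_{i+1}) &= w_i(A) - \sum_{a \in A} d(a,c) - d(c, r_{i+1}) \\
 &\ge w_i(A) - \sum_{a \in A} d(a,c) - d(c, u_j) - d(u_j, r_{i+1}) \\
 &= \mu_i(A', u_j) - d(u_j, r_{i+1}).
\end{aligned}$$

If $u_j \in A$ then let $A' = A - u_j + r_{i+1}$. We have

$$\begin{aligned}
\mu_i(A, r_{i+1}) &= w_i(A) - \sum_{a \in A} d(a,c) - d(c, r_{i+1}) \\
 &= w_i(A) - \sum_{a \in A - u_j} d(a,c) - d(c, u_j) - d(c, r_{i+1}) \\
 &\ge w_i(A - u_j + r_{i+1}) - d(u_j, r_{i+1})
      - \sum_{a \in A - u_j + r_{i+1}} d(a,c) - d(c, u_j) \\
 &= \mu_i(A', u_j) - d(u_j, r_{i+1}).
\end{aligned}$$

It follows that

$$\Phi(U_i - u_j + r_{i+1}, w_i) - \Phi(U_i, w_i)
  = \min_A \mu_i(A, r_{i+1}) - \min_A \mu_i(A, u_j) \ge -d(u_j, r_{i+1}).$$

□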



Lemma 4. For any configuration $U_{i+1}$ that contains the last request
$r_{i+1}$ of $w_{i+1}$:

$$\Phi(U_{i+1}, w_{i+1}) - \Phi(U_{i+1}, w_i) \ge \max_X\{w_{i+1}(X) - w_i(X)\}.$$

Proof. Let $B$ be a minimizer of $r_{i+1}$ with respect to $w_i$ that does not
contain $r_{i+1}$. Then by the Duality lemma, $B$ is also a minimizer of
$r_{i+1}$ with respect to $w_{i+1}$. From the monotonicity property of work
functions we have

$$\mu_{i+1}(A, u_l) \ge \mu_i(A, u_l)$$

for all $A$ and $u_l$. It follows that

$$\begin{aligned}
\Phi(U_{i+1}, w_{i+1}) - \Phi(U_{i+1}, w_i)
 &\ge \min_A \mu_{i+1}(A, r_{i+1}) - \min_A \mu_i(A, r_{i+1}) \\
 &= \Bigl(w_{i+1}(B) - \sum_{b \in B} d(b,c) - d(c, r_{i+1})\Bigr)
  - \Bigl(w_i(B) - \sum_{b \in B} d(b,c) - d(c, r_{i+1})\Bigr) \\
 &= w_{i+1}(B) - w_i(B).
\end{aligned}$$

The proof is complete, since by the Duality lemma:

$$w_{i+1}(B) - w_i(B) = \max_X\{w_{i+1}(X) - w_i(X)\}.$$

□



We can now combine the two above lemmata to get the main result of this 
section. 

Theorem 2. The work function algorithm is k-competitive for the weighted star. 

Proof. Let $w_0$, $w_n$ be the initial and final work functions, and $U_0$, $U_n$
be the initial and final adversary configurations respectively.

Let EXT and OPT denote the total extended cost and the optimal offline cost.
Combining Lemmas 3 and 4 we get that

$$\Phi(U_{i+1}, w_{i+1}) - \Phi(U_i, w_i)
  \ge \max_X\{w_{i+1}(X) - w_i(X)\} - d(u_j, u'_j),$$

where $u'_j = r_{i+1}$. The distance $d(u_j, u'_j) = d(U_i, U_{i+1})$ is the cost
of the adversary to service $r_{i+1}$.

Summing for all requests and assuming that the adversary moves optimally,
we get

$$\Phi(U_n, w_n) - \Phi(U_0, w_0) \ge \mathrm{EXT} - \mathrm{OPT}.$$

Since $\Phi_n = \Phi(U_n, w_n) \le k \cdot w_n(U_n)$, and
$\Phi_0 = \Phi(U_0, w_0) = -|U_0|$ (the sum of the distances between all pairs of
points in $U_0$), we obtain

$$\mathrm{EXT} \le \Phi_n - \Phi_0 + \mathrm{OPT} \le (k+1) \cdot \mathrm{OPT} + |U_0|.$$

The total extended cost is bounded above by $k+1$ times the optimal cost plus a
constant depending only on the initial configuration. We conclude that the work
function algorithm is $k$-competitive for weighted star metric spaces. □
5 Metric Spaces with k + 2 Points

In this section, we show that the $k$-server conjecture holds for metric spaces
of $k+2$ points. This result was first shown in [?], but we give a simpler proof
here. As in [?], instead of studying the $k$-server problem on $k+2$ points, it
is simpler to consider the "dual" problem which is called the 2-evader problem.
In the 2-evader problem, 2 evaders occupy distinct points of a metric space $M$
of $k+2$ points. The evaders respond to a sequence of ejections (requests) which
is simply a sequence of points. If an evader occupies the point of an ejection, it
has to move to some other point. The objective is to minimize the total distance
traveled by the 2 evaders.

The 2-evader problem is equivalent to the $k$-server problem: servers occupy
the points not occupied by evaders, and an ejection for the evaders is a request
for the servers. This equivalence allows the theory of the $k$-server problem and
in particular the notion of the extended cost and the Duality lemma to be
transferred to the evader problem. See [?] for a more extensive discussion of the
evader problem and its equivalence to the $k$-server problem. The extended cost is
again equal to the maximum increase of the work function. The corresponding
Duality lemma is:

Lemma 5 (Duality lemma for the 2-evader problem). Assume that $\{x, y\}$
minimizes the expression $w_i(x,y) + d(r_{i+1},x) + d(r_{i+1},y)$. Then $\{x,y\}$
also minimizes $w_{i+1}(x,y) + d(r_{i+1},x) + d(r_{i+1},y)$ and maximizes the
extended cost:

$$\max_{x,y}\{w_{i+1}(x,y) - w_i(x,y)\}.$$

As in the $k$-server problem, a configuration $\{x,y\}$ that minimizes
$w_i(x,y) + d(p,x) + d(p,y)$ is called a minimizer of $p$ with respect to $w_i$.
It is not hard to show (see [?]) that without loss of generality a minimizer of a
point $p$ contains $p$. In particular, a minimizer of $r_{i+1}$ is a configuration
$\{r_{i+1}, x\}$ that minimizes $w_i(r_{i+1},x) + d(r_{i+1},x)$.

With the Duality lemma, we are ready to prove the main theorem of this 
section. We will make use of the following notational convenience: whenever we 
write w{x,y), we implicitly mean that x and y are distinct. 

Theorem 3. The WFA algorithm is $k$-competitive in every metric space of $k+2$
points.

Proof. The argument again is based on a potential. We want to find a potential
that "includes" a minimizer of $r_{i+1}$. It is easy to see that the potential
$\Phi_i = \sum_a \min_x\{w_i(a,x) + d(a,x)\}$ includes a minimizer of $r_{i+1}$
and can be used to prove that the WFA algorithm is $(k+1)$-competitive. This
follows from
$\Phi_{i+1} - \Phi_i \ge \min_x\{w_{i+1}(r_{i+1},x) + d(r_{i+1},x)\} - \min_x\{w_i(r_{i+1},x) + d(r_{i+1},x)\}$;
by the Duality lemma, the last expression is equal to the extended cost to service
$r_{i+1}$. Clearly, the total extended cost is at most $\Phi_n - \Phi_0$. Since
$\Phi_n$ is within a constant from $(k+2)\,\mathrm{OPT}(\rho)$ and $\Phi_0$ is
constant, it follows that the WFA has competitive ratio at most $k+1$.
How can we alter $\Phi_i$ to reduce the competitive ratio to $k$? Let $b_1$ and
$b_2$ minimize $w_i(x,y) + d(x,y)$. The crucial observation is that
$\{b_1, b_2\}$ is a minimizer of both $b_1$ and $b_2$. Thus, the number of
distinct minimizers is at most $k+1$. Equivalently, even if we subtract
$\min_{x,y}\{w_i(x,y) + d(x,y)\}$ from $\Phi_i$, the resulting expression still
contains a minimizer for every point and in particular of $r_{i+1}$. This suggests
the following potential:

$$\Phi_i = \sum_a \min_x\{w_i(a,x) + d(a,x)\} - \min_{x,y}\{w_i(x,y) + d(x,y)\}. \qquad (1)$$

Notice that
$\Phi_i = \sum_{a \ne b_1} \min_x\{w_i(a,x) + d(a,x)\} = \sum_{a \ne b_2} \min_x\{w_i(a,x) + d(a,x)\}$.
Since $b_1$ and $b_2$ are distinct, at least one of them is not equal to
$r_{i+1}$; without loss of generality, say $b_1 \ne r_{i+1}$. By expressing

$$\Phi_i = \sum_{a \ne b_1} \min_x\{w_i(a,x) + d(a,x)\},$$

we observe that the sum includes the term corresponding to $r_{i+1}$. For the
potential $\Phi_{i+1}$, we also get

$$\begin{aligned}
\Phi_{i+1} &= \sum_a \min_x\{w_{i+1}(a,x) + d(a,x)\} - \min_{x,y}\{w_{i+1}(x,y) + d(x,y)\} \\
 &\ge \sum_a \min_x\{w_{i+1}(a,x) + d(a,x)\} - \min_y\{w_{i+1}(b_1,y) + d(b_1,y)\} \\
 &= \sum_{a \ne b_1} \min_x\{w_{i+1}(a,x) + d(a,x)\}.
\end{aligned}$$

Therefore, by subtracting, we get
$\Phi_{i+1} - \Phi_i \ge \min_x\{w_{i+1}(r_{i+1},x) + d(r_{i+1},x)\} - \min_x\{w_i(r_{i+1},x) + d(r_{i+1},x)\}$,
which is equal to the extended cost to service $r_{i+1}$. By applying to
$\Phi_i$ the same argument we used for the previous potential, we establish that
the WFA algorithm is $k$-competitive.

Notice an important difference between the potential we use in this proof and
the potential of [?]: the potential here involves a max operator (the minus min
part of (1)). On the other hand, the potential of [?] has only a min operator
(and seems to be the minimal potential). □

6 Conclusions 

We showed that the WFA algorithm is $k$-competitive for the line, the weighted
cache and for all metric spaces of $k+2$ points. In all cases, we exploited the
fact that the number of different minimizers is $k+1$. This suggests that it may
be worth investigating the cardinality of the set of minimizers for other special
metric spaces, even for general metric spaces. Even if a metric space is
guaranteed to have at most $k+1$ minimizers, we don't know how to use this fact in
general to establish that the WFA is $k$-competitive for this metric space. Is
there a simple sufficient condition for this? Finally, as an intermediate step
towards establishing the $k$-server conjecture, can we show that the WFA is
$k$-competitive for trees?
Abstract. A Boolean function $b$ is a hard core predicate for a one-way
function $f$ if $b$ is polynomial time computable but $b(x)$ is difficult to
predict from $f(x)$. A general family of hard core predicates is a family
of functions containing a hard core predicate for any one-way function.
A seminal result of Goldreich and Levin asserts that the family of parity
functions is a general family of hard core predicates. We show that
no general family of hard core predicates can consist of functions with
$O(n^{1-\epsilon})$ average sensitivity, for any $\epsilon > 0$. As a result, such
families cannot consist of monotone functions, functions computed by generalized
threshold gates, or symmetric $d$-threshold functions, for
$d = O(n^{1/2-\epsilon})$ and $\epsilon > 0$. This also subsumes a 1997 result of
Goldmann and Näslund which asserts that such families cannot consist of functions
computable in $AC^0$. The above bound on sensitivity is obtained by (lower)
bounding the high order terms of the Fourier transform.



1 Introduction 

A basic assumption on which much of modern (theoretical) cryptography rests 
is the existence of one-way functions. In general, such functions may have quite 
pathological structure, and the development of useful cryptographic primitives 
from general one-way functions (often with additional properties) is one of the 
triumphs of modern cryptography. One of the more troubling ways that a one-way
function may be unsatisfactory is that it may "leak" information about $x$
into $f(x)$; in particular, it may be possible to compute nearly all of $x$ from
$f(x)$ in polynomial time. The problem of showing that $f(x)$ hides at least one
bit of information about $x$ is the hard core predicate problem. Goldreich and
Levin, in a seminal 1989 paper, demonstrated that every one-way function has a
hard core predicate. Specifically, they show that for any one-way function $f$,
there is a polynomial-time predicate $b_f$ so that $b_f(x)$ is difficult to
compute from $f(x)$. A hard core predicate, though a basic primitive, has
remarkable potency:

- If $f$ is a permutation, a hard core predicate immediately gives rise to a
pseudorandom generator.

- If $f$ is a permutation, a hard core predicate immediately gives rise to a
secure bit-commitment scheme.

- If $f$ is a one-way trapdoor permutation, a hard core predicate for $f$
immediately gives rise to a probabilistic encryption scheme (see [?]).

- The Goldreich-Levin construction of a hard core predicate for any one-way
function is an important ingredient in the proof that the existence of one-way
functions implies the existence of pseudorandom generators [?].

Considering their importance, attention has been given to how simple such
predicates can be. A 1997 result of Goldmann and Näslund [?] shows that they
cannot, in general, be computed in $AC^0$. We strengthen this result,
demonstrating that, in general, hard core predicates must have a non-negligible
portion of their Fourier transform concentrated on high-degree coefficients. From
this it follows that such predicates

- cannot have small average sensitivity (specifically, they cannot have average
sensitivity $O(n^{1-\epsilon})$ for any $\epsilon > 0$),

- cannot be monotone,

- cannot be computed by generalized threshold functions, and

- cannot be computed by symmetric $d$-threshold functions, with
$d = O(n^{1/2-\epsilon})$ for any $\epsilon > 0$.

As mentioned above, this bound on the spectrum also implies that general hard
core predicates cannot be in $AC^0$, as it is known that the Fourier transform of
any $AC^0$ function is concentrated on coefficients of weight $\log^{O(1)} n$ [?].
It is interesting to note that these results parallel those for universal hash
functions obtained by Mansour et al. in [?].

Section 2 defines the notions of one-way function and hard core predicate.
Section 3 briefly erects the framework of Fourier analysis for Boolean functions.
Sections 4 and 5 are devoted to proving the main theorem and discussing some
applications.



2 One-Way Functions and Hard Core Predicates 

A function $f: \{0,1\}^* \to \{0,1\}^*$ is length preserving if
$f(\{0,1\}^n) \subseteq \{0,1\}^n$ for all $n$. We write $f^{(n)}$ for $f$
restricted to inputs of length $n$. For convenience, and without loss of
generality, we restrict our attention to length preserving one-way functions:

Definition 1. A (length-preserving) function $f: \{0,1\}^* \to \{0,1\}^*$ is a
one-way function if $f$ is computable in polynomial time, and for all functions
$A: \{0,1\}^* \to \{0,1\}^*$ computable by polynomial-size circuits and for all
$k > 0$,

$$\Pr[f(A(f^{(n)}(x))) = f^{(n)}(x)] = O(n^{-k}),$$

where the probability is taken uniformly over all $x \in \{0,1\}^n$.

In this cryptographic setting we consider $A$ to be a polynomially bounded
adversary attempting to invert the function $f$.

As discussed in the introduction, a hard core predicate for a one-way function
$f$ is a polynomial time predicate $b$ for which the value $b(x)$ is difficult to
predict from $f(x)$. For reasons which will become clear later, it will be
convenient for us to express Boolean functions as functions taking values in the
set $\{\pm 1\}$.

Definition 2. The Boolean function $b: \{0,1\}^* \to \{\pm 1\}$ is a hard core
predicate for a length-preserving one-way function $f$ if $b$ is computable in
polynomial time and for all functions $A: \{0,1\}^* \to \{\pm 1\}$, computable by
polynomial-size circuits, and for all $k > 0$,

$$\Pr[A(f^{(n)}(x)) = b^{(n)}(x)] = \tfrac{1}{2} + O(n^{-k}),$$

this probability taken uniformly over all $x \in \{0,1\}^n$.

For a more detailed discussion of one-way functions, hard core predicates,
and their uses in modern cryptography, see [?] and [?].



3 Fourier Analysis of Boolean Functions 



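Let $L(\mathbb{Z}_2^n) = \{f: \mathbb{Z}_2^n \to \mathbb{R}\}$ denote the set of
real valued functions on $\mathbb{Z}_2^n = \{0,1\}^n$. Though our interest shall
be in Boolean functions, it will be temporarily convenient to consider this richer
space. $L(\mathbb{Z}_2^n)$ is a vector space over $\mathbb{R}$ of dimension $2^n$,
and has a natural inner product: for $f, g \in L(\mathbb{Z}_2^n)$, we define

$$\langle f, g \rangle = \frac{1}{2^n} \sum_{x \in \{0,1\}^n} f(x)\,g(x).$$

For a subset $\alpha \subseteq \{1, \ldots, n\}$, we define
$\chi_\alpha: \{0,1\}^n \to \mathbb{R}$ so that
$\chi_\alpha(x) = \prod_{i \in \alpha} (-1)^{x_i}$. These functions $\chi_\alpha$
are the characters of $\mathbb{Z}_2^n = \{0,1\}^n$. Among their many wonderful
properties is the fact that the characters form an orthonormal basis for
$L(\mathbb{Z}_2^n)$: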

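Proposition 1.

1. $\forall \alpha \subseteq [n]$,
   $\sum_{x \in \{0,1\}^n} \chi_\alpha(x) = 2^n$ if $\alpha = \emptyset$, and $0$
   otherwise,

2. $\forall \alpha, \beta \subseteq [n]$,
   $\chi_\alpha(x)\chi_\beta(x) = \chi_{\alpha \oplus \beta}(x)$, where
   $\alpha \oplus \beta$ denotes the symmetric difference of $\alpha$ and
   $\beta$, and

3. $\forall \alpha, \beta \subseteq [n]$,
   $\langle \chi_\alpha, \chi_\beta \rangle = 1$ if $\alpha = \beta$, and $0$
   otherwise.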
Considering item 3, the characters $\{\chi_\alpha \mid \alpha \subseteq [n]\}$
are orthogonal and have unit length. Since there are $2^n$ characters, they span
$L(\mathbb{Z}_2^n)$, as promised. Any function $f: \{0,1\}^n \to \mathbb{R}$ may
then be written in terms of this basis:
$f = \sum_{\alpha \subseteq [n]} \hat{f}_\alpha \chi_\alpha$, where
$\hat{f}_\alpha = \langle f, \chi_\alpha \rangle$ is the projection of $f$ onto
$\chi_\alpha$. These coefficients $\hat{f}_\alpha$, $\alpha \subseteq [n]$, are
the Fourier coefficients of $f$, and, as we have above observed, uniquely
determine the function $f$.

Given the above, it is easy to establish the Plancherel equality:

Proposition 2. Let $f \in L(\mathbb{Z}_2^n)$. Then
$\|f\|_2^2 = \sum_\alpha \hat{f}_\alpha^2$, where
$\|f\|_2^2 = \langle f, f \rangle = 2^{-n} \sum_x f(x)^2$.

As always, $\hat{f}_\emptyset = \mathrm{Exp}[f]$ and, when $f$ is Boolean,
$\sum_\alpha \hat{f}_\alpha^2 = \|f\|_2^2 = 1$.
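As a concrete illustration, the expansion and the Plancherel equality can be checked by brute force for small $n$. The sketch below is not part of the paper; it assumes the normalized inner product $\langle f, g \rangle = 2^{-n}\sum_x f(x)g(x)$, indexes coordinates from 0 rather than 1, and uses hypothetical function names. The three-bit majority function serves as the example.

```python
from fractions import Fraction
from itertools import combinations, product

def character(alpha, x):
    """chi_alpha(x) = (-1)^(sum of x_i for i in alpha)."""
    return -1 if sum(x[i] for i in alpha) % 2 else 1

def fourier_coefficients(f, n):
    """Map each subset alpha (a tuple of indices) to <f, chi_alpha>."""
    inputs = list(product((0, 1), repeat=n))
    subsets = [a for size in range(n + 1) for a in combinations(range(n), size)]
    return {a: Fraction(sum(f(x) * character(a, x) for x in inputs), 2 ** n)
            for a in subsets}

def majority3(x):
    """Majority of three bits with values in {+1, -1} (-1 when most bits are 1)."""
    return -1 if sum(x) >= 2 else 1
```

Exact rational arithmetic via `Fraction` avoids any floating-point slack when comparing the two sides of the identity.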

A prominent theme in the study of (continuous) Fourier analysis is local-global
duality:

. . . the speed of convergence of a Fourier series improves with the
smoothness of $f$. This reflects the fact that local features of $f$ (such as
smoothness) are reflected in global features of $\hat{f}$ (such as rapid decay
at $n = \pm\infty$). This local-global duality is one of the major themes of
Fourier series and integrals, . . .

- Dym, McKean [?, p. 31]

This very same duality (between smoothness of $f$ and rapid decay of $\hat{f}$)
shall be central for our study. In our framework, a natural measure of smoothness
for a Boolean function $f$ is its average sensitivity:

Definition 3. The average sensitivity of a Boolean function
$f: \{0,1\}^n \to \{\pm 1\}$ is the quantity

$$S(f) = 2^{-n} \sum_{x \in \{0,1\}^n} \sum_{i=1}^{n} \tfrac{1}{2}\,\bigl|f(x) - f(x \oplus e_i)\bigr|,$$

where $e_i \in \{0,1\}^n$ denotes the vector containing a single 1 at position
$i$ and $\oplus$ denotes coordinatewise sum modulo 2. (The $\frac{1}{2}$ factor
appearing in the last term here reflects our choice of $\{\pm 1\}$ as the range of
our Boolean functions.)

Let us look at some examples. The average sensitivity for the $n$-input parity
function is $n$, since for any input $x$, flipping any of the $n$ input bits will
change the parity. The $n$-input OR function has average sensitivity
$2n \cdot 2^{-n}$: if the input is all 0, then flipping any bit changes the value
of the function (for this input the inner sum is equal to $n$), for any of the $n$
inputs with a single input bit being 1 flipping that bit will change the value of
the function (for each of these $n$ inputs the inner sum is equal to 1), and for
all inputs with at least two bits set to 1 the inner sum is equal to 0.
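Both examples can be reproduced by evaluating the definition directly. The following brute-force sketch is an illustration only, with hypothetical function names, inputs represented as tuples of bits, and values in $\{\pm 1\}$ as in the text:

```python
from fractions import Fraction
from itertools import product

def average_sensitivity(f, n):
    """S(f) = 2^-n * sum over x and i of |f(x) - f(x xor e_i)| / 2."""
    total = Fraction(0)
    for x in product((0, 1), repeat=n):
        for i in range(n):
            flipped = x[:i] + (1 - x[i],) + x[i + 1:]
            total += Fraction(abs(f(x) - f(flipped)), 2)
    return total / 2 ** n

def parity(x):
    """Parity with values in {+1, -1}."""
    return -1 if sum(x) % 2 else 1

def or_function(x):
    """OR with values in {+1, -1} (-1 when some bit is 1)."""
    return -1 if any(x) else 1
```

For $n = 4$ this gives 4 for parity and $8/16 = 1/2$ for OR, matching $n$ and $2n \cdot 2^{-n}$.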

Observe that average sensitivity is proportional to the likelihood that a random
pair of neighboring points take on different values: "smooth" functions,
where neighboring points are likely to agree, should have small average
sensitivity. Functions with small average sensitivity are likely to have the same
value on similar looking inputs; it is this property we shall exploit.
The connection between average sensitivity (smoothness) and rapid decay of
the Fourier transform is given by the following equality, due to Kahn, Kalai, and
Linial [?]:

$$S(f) = \sum_\alpha |\alpha|\,\hat{f}_\alpha^2. \qquad (1)$$

Considering the above equality, and recalling that $\|f\|_2 = 1$ for a Boolean
function $f$, the average sensitivity of $f$ is exactly determined by the
distribution of this unit mass among the terms $\hat{f}_\alpha^2$. This is a
manifestation of the local-global duality principle mentioned above: functions
having their Fourier transform concentrated on small coefficients (those for which
$|\alpha|$ is small) have small average sensitivity and, as such, are smooth. In
this case, we opt to define our notion of smoothness in terms of the Fourier
transform as follows:
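The equality $S(f) = \sum_\alpha |\alpha| \hat{f}_\alpha^2$ can be confirmed by brute force on a small function. The sketch below is illustrative rather than the paper's own computation; the names are hypothetical and coordinates are 0-based. The combinatorial side counts sensitive (input, bit) pairs, which equals $\frac{1}{2}|f(x) - f(x \oplus e_i)|$ summed over all pairs when $f$ takes values in $\{\pm 1\}$.

```python
from fractions import Fraction
from itertools import combinations, product

def chi(alpha, x):
    """Character chi_alpha(x) = (-1)^(sum of x_i for i in alpha)."""
    return -1 if sum(x[i] for i in alpha) % 2 else 1

def average_sensitivity(f, n):
    """Fraction of (x, i) pairs where flipping bit i changes f, times n."""
    flips = sum(f(x) != f(x[:i] + (1 - x[i],) + x[i + 1:])
                for x in product((0, 1), repeat=n) for i in range(n))
    return Fraction(flips, 2 ** n)

def spectral_sensitivity(f, n):
    """Sum over alpha of |alpha| times the squared Fourier coefficient."""
    inputs = list(product((0, 1), repeat=n))
    total = Fraction(0)
    for size in range(n + 1):
        for alpha in combinations(range(n), size):
            coeff = Fraction(sum(f(x) * chi(alpha, x) for x in inputs), 2 ** n)
            total += size * coeff ** 2
    return total

def majority3(x):
    """Majority of three bits with values in {+1, -1}."""
    return -1 if sum(x) >= 2 else 1
```

For three-bit majority both sides equal $3/2$: the three singleton coefficients and the coefficient of the full set each have square $1/4$.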

Definition 4. We say that a function $f: \{0,1\}^n \to \{\pm 1\}$ is
$(t, \delta)$-smooth iff

$$\sum_{|\alpha| > t} \hat{f}_\alpha^2 \le \delta.$$

A function $g: \{0,1\}^* \to \{\pm 1\}$ is $(t(n), \delta(n))$-smooth iff there
exists $n_0 > 0$ so that for all $n > n_0$, $g^{(n)}$ is
$(t(n), \delta(n))$-smooth.
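A small sketch of this definition, assuming $(t, \delta)$-smoothness means that the Fourier weight on sets of size greater than $t$ is at most $\delta$. The function names are hypothetical, and the brute-force enumeration is only practical for very small $n$.

```python
from itertools import combinations, product

def fourier_weight_above(f, n, t):
    """Sum of squared Fourier coefficients over subsets alpha with |alpha| > t."""
    inputs = list(product((0, 1), repeat=n))
    total = 0.0
    for size in range(t + 1, n + 1):
        for alpha in combinations(range(n), size):
            corr = sum(f(x) * (-1) ** sum(x[i] for i in alpha) for x in inputs)
            total += (corr / 2 ** n) ** 2
    return total

def is_smooth(f, n, t, delta):
    """True when f (on n bits) is (t, delta)-smooth."""
    return fourier_weight_above(f, n, t) <= delta

def parity(x):
    """Parity with values in {+1, -1}."""
    return -1 if sum(x) % 2 else 1
```

Parity is the extreme non-smooth case: all of its unit mass sits on the single coefficient of size $n$, so it is $(t, \delta)$-smooth for $t < n$ only when $\delta \ge 1$.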

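A final word on notation: we will frequently study functions $f(x,y)$ that take
two strings $x, y$ as input. Using $I_x$ and $I_y$ as the (disjoint) index sets
for $x$ and $y$ respectively, it will be convenient to index the Fourier
coefficients of $f$ with two sets $\alpha, \beta$, where $\alpha \subseteq I_x$
and $\beta \subseteq I_y$:

$$f(x,y) = \sum_{\substack{\alpha \subseteq I_x \\ \beta \subseteq I_y}}
   \hat{f}_{\alpha,\beta}\,\chi_{\alpha \cup \beta}(x,y)
 = \sum_{\substack{\alpha \subseteq I_x \\ \beta \subseteq I_y}}
   \hat{f}_{\alpha,\beta}\,\chi_\alpha(x)\,\chi_\beta(y),$$

where $\chi_{\alpha \cup \beta}(x,y) = \chi_\alpha(x)\chi_\beta(y)$ since
$\alpha \cap \beta = \emptyset$.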

4 Main Result 

We can now begin working toward our main result which asserts that if one-way
functions exist, then there are one-way functions for which every hard core
predicate is highly non-smooth. In Section 5 we explore the consequences of this
theorem for general hard core predicates.

The following theorem implies the bound on sensitivity claimed in the intro- 
duction. 

Theorem 1. If there exists a one-way function, then for every $\epsilon > 0$ there
is a one-way function $f_\epsilon$ such that no
$(\gamma n^{1-\epsilon}, \delta)$-smooth Boolean function
$b: \{0,1\}^* \to \{\pm 1\}$ can be a hard core predicate for $f_\epsilon$ if
$\gamma + \delta < 1/16$.
Proof. Let $g: \{0,1\}^* \to \{0,1\}^*$ be a (length preserving) one-way function.
Fix an arbitrary constant $\epsilon > 0$. We describe below a one-way function
$f_\epsilon = f(x,y)$ of $n$ variables so that if
$b^{(n)}(x,y) = \sum_{\alpha,\beta} \hat{b}_{\alpha,\beta}\,\chi_\alpha(x)\chi_\beta(y)$
and $\sum_{|\alpha| + |\beta| > \gamma n^{1-\epsilon}} \hat{b}_{\alpha,\beta}^2 \le \delta$,
then given $f(x,y)$, one can guess $b^{(n)}(x,y)$ with high probability. We will
henceforth abandon the superscript on $b$ whenever convenient.

Assume that $n$ is even and define $f(x,y)$ where $|x| = |y| = n/2$ as follows.
For an element $w \in \{0,1\}^k$ and a subset
$S = \{s_1, \ldots, s_l\} \subseteq \{1, \ldots, k\}$, let
$w_S = w_{s_1} \cdots w_{s_l}$, where $s_1 < \cdots < s_l$. The input $x$ is
divided into $t_x = n^{1-\epsilon}$ blocks, each consisting of
$w_x(n) = n^\epsilon/2$ bits. Similarly, $y$ is divided into
$t_y = \log n^{1-\epsilon}$ blocks $B_1, \ldots, B_{t_y}$, each consisting of
$w_y(n) = n/(2 \log n^{1-\epsilon})$ bits. For simplicity, we ignore issues of
integrality for these quantities. Then the value $f(x,y)$ is computed as follows.
Write $y = y_{B_1} \cdots y_{B_{t_y}}$ and let $J_i = \bigoplus_{k \in B_i} y_k$ be
the parity of the bits in $y_{B_i}$; interpret the result $J_1, \ldots, J_{t_y}$
as a binary coded integer $J(y) \in \{0, \ldots, t_x - 1\}$.

We define the set

$$\mathcal{J}(y) = \{J(y)\,w_x + 1, \ldots, [J(y) + 1]\,w_x\};$$

these are precisely the indices of the $J(y)$th block of $x$. Finally, define
$f(x,y) = (z,y)$ where $z_i = x_i$ when $i \notin \mathcal{J}(y)$ and
$z_{\mathcal{J}(y)} = g(x_{\mathcal{J}(y)})$. Clearly $f$ is a one-way function,
since inverting $f$ in polynomial time implies inversion of $g$ on $n$-bit inputs
in time polynomial in $(2n)^{1/\epsilon}$.
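The construction can be sketched in code under a few stated assumptions that are not in the paper: block counts are passed explicitly (with $t_x = 2^{t_y}$) to sidestep the integrality issues, indices are 0-based, the block parities are read most significant bit first, and `toy_g` is a placeholder that is certainly not one-way. All names are hypothetical.

```python
def block_index(y, t_y):
    """J(y): parities of the t_y blocks of y, read as a binary integer."""
    w_y = len(y) // t_y
    J = 0
    for i in range(t_y):
        J = 2 * J + sum(y[i * w_y:(i + 1) * w_y]) % 2
    return J

def f(x, y, t_x, t_y, g):
    """Apply g to the block of x selected by J(y); copy everything else."""
    assert 2 ** t_y == t_x, "J(y) must range over exactly the t_x blocks of x"
    w_x = len(x) // t_x
    J = block_index(y, t_y)
    lo, hi = J * w_x, (J + 1) * w_x
    return x[:lo] + g(x[lo:hi]) + x[hi:], y

def toy_g(block):
    """Stand-in for the one-way function g (bit reversal is NOT one-way)."""
    return block[::-1]
```

The point to notice is that the output reveals $y$ and every bit of $x$ outside the selected block in the clear; only the selected block is hidden behind $g$.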

Now, let $b: \{0,1\}^* \to \{\pm 1\}$ be a
$(\gamma n^{1-\epsilon}, \delta)$-smooth Boolean predicate. Our goal is to show
that $b$ cannot be a hard core predicate for $f$. For the remainder of this
section, fix the input length to $n$, an integer large enough so that $b^{(n)}$ is
$(\gamma n^{1-\epsilon}, \delta)$-smooth and
$4\gamma < n^\epsilon / \log n^{1-\epsilon}$.

The following lemma then implies the theorem. 

Lemma 1. If $b^{(n)}(x,y)$ is $(\gamma n^{1-\epsilon}, \delta)$-smooth, then there
is a probabilistic polynomial time algorithm $A_b$ such that

$$\Pr[A_b(f(x,y)) = b(x,y)] \ge 1 - 8(\gamma + \delta),$$

this probability taken over uniformly random choice of $x$ and $y$ and the coin
tosses of $A_b$.



Remark 1. Hard core predicates are defined with respect to adversaries that 
are polynomial-size circuits. However, since probabilistic polynomial-time al- 
gorithms are less powerful than polynomial-size circuits, the above lemma is 
sufficient. 

Describing $A_b$ is simple. Given $f(x,y)$, $y$, and hence $J(y)$, is known. Also,
all of $x$ except $x_{\mathcal{J}(y)}$ is known. Form $x'$ by letting
$x'_i = x_i$ when $i \notin \mathcal{J}(y)$, and picking $x'_{\mathcal{J}(y)}$
uniformly at random. Finally, let $A_b(f(x,y)) = b(x',y)$. The guess is correct
when $b(x,y) = b(x',y)$. Note that $x'$ and $y$ are independent and uniformly
distributed. However, $x$ and $x'$ are of course highly dependent.

In this case, Lemma 1 follows from Lemma 2 below.
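The predictor just described can be sketched as follows. This is an illustration under assumptions not fixed by the text: the block counts `t_x` and `t_y` are passed explicitly, indices are 0-based, the block parities are read most significant bit first, and the predicate `b` is an arbitrary Python callable returning a value in $\{\pm 1\}$. The names are hypothetical.

```python
import random

def block_index(y, t_y):
    """J(y): parities of the t_y blocks of y, read as a binary integer."""
    w_y = len(y) // t_y
    J = 0
    for i in range(t_y):
        J = 2 * J + sum(y[i * w_y:(i + 1) * w_y]) % 2
    return J

def guess(b, z, y, t_x, t_y, rng=random):
    """Guess b(x, y) from f(x, y) = (z, y) by resampling the hidden block."""
    w_x = len(z) // t_x
    J = block_index(y, t_y)
    lo, hi = J * w_x, (J + 1) * w_x
    fresh = [rng.randrange(2) for _ in range(w_x)]  # replaces the g-image block
    x_prime = z[:lo] + fresh + z[hi:]
    return b(x_prime, y)
```

If `b` happens to ignore the hidden block entirely, the guess is always correct; the lemmas that follow quantify how close a smooth predicate comes to this ideal.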



Lemma 2. If $b(x,y)$ is $(\gamma n^{1-\epsilon}, \delta)$-smooth, and $x, x', y$
are generated as described above, then
$\Pr[b(x,y) = b(x',y)] \ge 1 - 8(\gamma + \delta)$.

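Let $Z$ be the indicator function

$$Z(A,B) = \begin{cases} 1 & \text{if } A \cap B \ne \emptyset, \text{ and} \\ 0 & \text{otherwise,} \end{cases}$$

and define

$$\begin{aligned}
e(x,y) &= \sum_{|\alpha| + |\beta| \le \gamma n^{1-\epsilon}}
   \hat{b}_{\alpha,\beta}\,\chi_\alpha(x)\chi_\beta(y)\,\bigl(1 - Z(\alpha, \mathcal{J}(y))\bigr), \\
h(x,y) &= \sum_{|\alpha| + |\beta| \le \gamma n^{1-\epsilon}}
   \hat{b}_{\alpha,\beta}\,\chi_\alpha(x)\chi_\beta(y)\,Z(\alpha, \mathcal{J}(y)), \text{ and} \\
r(x,y) &= \sum_{|\alpha| + |\beta| > \gamma n^{1-\epsilon}}
   \hat{b}_{\alpha,\beta}\,\chi_\alpha(x)\chi_\beta(y).
\end{aligned}$$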

Now, $b(x,y) = e(x,y) + h(x,y) + r(x,y)$, and note that $e(x,y)$ depends only on
inputs exposed by $f(x,y)$ (i.e., it does not depend on $x_{\mathcal{J}(y)}$),
whereas each term in $h(x,y)$ depends on some hidden bits (i.e., bits in
$x_{\mathcal{J}(y)}$).

Observe that when $x, x', y$ are generated according to the above procedure,
$e(x,y) = e(x',y)$. We will prove that for random $(x,y)$, with high probability
both $|r(x,y)|$ and $|h(x,y)|$ are small; this is enough to prove that with high
probability $b(x,y) = b(x',y)$. The contributions of $r(x,y)$ and $h(x,y)$ are
bounded by the following two lemmas.

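Lemma 3. If $b(x,y)$ is $(\gamma n^{1-\epsilon}, \delta)$-smooth, and $x, y$ are
uniformly distributed, then

$$\Pr[|r(x,y)| \ge \lambda] \le \lambda^{-2}\,\delta.$$

Proof. We bound the probability using a special case of the Chebyshev inequality:
for a real-valued random variable $X$ such that $\mathrm{Exp}[X] = 0$,

$$\Pr[|X - \mathrm{Exp}[X]| \ge \lambda] \le \lambda^{-2}\,\mathrm{Exp}[X^2]. \qquad (2)$$

As $\mathrm{Exp}[r(x,y)] = 0$, we have by linearity of expectation

$$\Pr[|r(x,y)| \ge \lambda]
  \le \lambda^{-2} \sum_{|\alpha| + |\beta| > \gamma n^{1-\epsilon}}
      \hat{b}_{\alpha,\beta}^2\,\mathrm{Exp}[(\chi_\alpha(x)\chi_\beta(y))^2]
  \le \lambda^{-2}\,\delta.$$

□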
Lemma 4. If $4\gamma < n^\epsilon / \log n^{1-\epsilon}$ then
$\Pr[|h(x,y)| \ge \lambda] \le \lambda^{-2}\,\gamma$.

The proof of Lemma 4 is slightly technical, so let us first see that Lemmas 3
and 4 together imply that with high probability $b(x,y) = b(x',y)$.

Proof (of Lemma 2). Since $|b(x,y)| = |b(x',y)| = 1$ it is enough to show that
with high probability $|b(x,y) - b(x',y)| < 2$.

Considering that $e(x,y) = e(x',y)$, applying the triangle inequality we have

$$|b(x,y) - b(x',y)| \le |r(x,y)| + |r(x',y)| + |h(x,y)| + |h(x',y)|.$$

Therefore,

$$\begin{aligned}
\Pr[|b(x,y) - b(x',y)| \ge 2]
 &\le \Pr[|r(x,y)| \ge 1/2] + \Pr[|r(x',y)| \ge 1/2] \\
 &\quad + \Pr[|h(x,y)| \ge 1/2] + \Pr[|h(x',y)| \ge 1/2] \\
 &\le 8\delta + 8\gamma.
\end{aligned}$$

The last inequality follows by two applications of Lemma 3 and two applications
of Lemma 4 (with $\lambda = 1/2$). □

As mentioned above, from Lemma 1 follows Theorem 1. □

It remains to prove Lemma 4.

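Proof (of Lemma 4). By linearity of expectation and independence of $x$ and $y$,

$$\mathrm{Exp}[h(x,y)] = \sum_{|\alpha| + |\beta| \le \gamma n^{1-\epsilon}}
   \hat{b}_{\alpha,\beta}\,\mathrm{Exp}[\chi_\alpha(x)]\,
   \mathrm{Exp}[Z(\alpha, \mathcal{J}(y))\,\chi_\beta(y)].$$

Now, $Z(\emptyset, \mathcal{J}(y)) = 0$ for all $y$, and when
$\alpha \ne \emptyset$ then $\mathrm{Exp}[\chi_\alpha(x)] = 0$; hence each term in
$h(x,y)$ has expectation 0 and $\mathrm{Exp}[h(x,y)] = 0$. By (2), it follows that

$$\Pr[|h(x,y)| \ge \lambda] \le \lambda^{-2}\,\mathrm{Exp}[(h(x,y))^2].$$

Expanding $\mathrm{Exp}[(h(x,y))^2]$ yields the expression

$$\sum_{\substack{|\alpha| + |\beta| \le \gamma n^{1-\epsilon} \\ |\alpha'| + |\beta'| \le \gamma n^{1-\epsilon}}}
  \hat{b}_{\alpha,\beta}\,\hat{b}_{\alpha',\beta'}\,
  \mathrm{Exp}[\chi_{\alpha \oplus \alpha'}(x)]\,
  \mathrm{Exp}[\chi_{\beta \oplus \beta'}(y)\,Z(\alpha, \mathcal{J}(y))\,Z(\alpha', \mathcal{J}(y))].$$

As $\mathrm{Exp}[\chi_\alpha(x)\chi_{\alpha'}(x)] = \delta_{\alpha,\alpha'}$, we
have

$$\Pr[|h(x,y)| \ge \lambda] \le \lambda^{-2}
  \sum_{\substack{|\alpha| + |\beta| \le \gamma n^{1-\epsilon} \\ |\alpha| + |\beta'| \le \gamma n^{1-\epsilon}}}
  \hat{b}_{\alpha,\beta}\,\hat{b}_{\alpha,\beta'}\,
  \mathrm{Exp}[\chi_{\beta \oplus \beta'}(y)\,Z(\alpha, \mathcal{J}(y))]. \qquad (3)$$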

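Accept, for the moment, the following claim.

Claim. If $\max(|\beta|, |\beta'|) \le \gamma n^{1-\epsilon}$, then
$\mathrm{Exp}[\chi_{\beta \oplus \beta'}(y)\,Z(\alpha, \mathcal{J}(y))] \le \gamma\,\delta_{\beta,\beta'}$.

From the claim and (3) we have

$$\Pr[|h(x,y)| \ge \lambda] \le \lambda^{-2} \sum_{\alpha,\beta} \gamma\,\hat{b}_{\alpha,\beta}^2 \le \lambda^{-2}\,\gamma.$$

The last inequality follows from the Plancherel equality.

It remains to prove the claim. First note that $\chi_{\beta \oplus \beta'}(y)$ and
$Z(\alpha, \mathcal{J}(y))$ are independent when
$\max(|\beta|, |\beta'|) \le \gamma n^{1-\epsilon}$. This follows from the fact
that each "bit" in $J(y)$ is the parity of $n/(2 \log n^{1-\epsilon})$ bits,
whereas $\chi_{\beta \oplus \beta'}(y)$ only depends on the parity of
$|\beta \oplus \beta'| \le 2\gamma n^{1-\epsilon}$ bits, which is fewer than the
number of bits in a block because $4\gamma < n^\epsilon / \log n^{1-\epsilon}$.
Thus, even when all the bits in $\beta \oplus \beta'$ are fixed, the bits of
$J(y)$ are still uniformly and independently distributed. As
$\mathrm{Exp}[\chi_{\beta \oplus \beta'}(y)] = \delta_{\beta,\beta'}$, it remains
to show that $\mathrm{Exp}[Z(\alpha, \mathcal{J}(y))] \le \gamma$, but this
follows from the fact that $\mathrm{Exp}[Z(\alpha, \mathcal{J}(y))]$ is the
probability that a randomly picked block intersects the fixed set $\alpha$. This
is bounded from above by $|\alpha| / n^{1-\epsilon}$ and as
$|\alpha| \le \gamma n^{1-\epsilon}$, we are done. □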

5 General Families of Hard Core Predicates 

In this section we consider a slightly different definition for the concept of a 
hard core predicate. We require only that they are computable in non-uniform 
polynomial time (that is, computable by polynomial-size circuits). The reason for 
considering this weaker definition is that rather than focusing on the behavior 
of a single predicate, we wish to explore families of predicates guaranteed to 
contain a hard core predicate for any one-way function /. Such families are 
called general families of hard core predicates. Typically, as in the Goldreich- 
Levin construction, a randomly chosen member of the family of predicates is 
likely to be a hard core predicate for /, and folding this “random choice” into 
the definition of b (naively) requires non-uniformity. This does not greatly affect 
the results in the previous section. The only difference is that the algorithm
$A_b$ requires a (polynomial-size) circuit for the predicate $b$ so that it can
evaluate $b$ on an input $x, y$.

Definition 5. A family $B \subseteq \{\pm 1\}^{\{0,1\}^*}$ is called a general
family of hard core predicates if for every one-way function $f$ there is a
(non-uniform) polynomial time computable predicate $b \in B$, such that $b$ is a
hard core predicate for $f$.

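The theorem of Goldreich and Levin mentioned in the introduction asserts
that the collection of functions

$$\mathcal{B}_{GL} = \bigl\{\, p: \{0,1\}^* \to \{\pm 1\} \;\bigm|\; \forall n,\ p^{(n)} = \chi_{\alpha_n} \text{ for some } \alpha_n \subseteq \{1, \ldots, n\} \,\bigr\}$$

is a general family of hard core predicates.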

One consequence of the theorem of the last section is that general families of 
hard core predicates cannot be smooth: 
Corollary 1. If $B$ is a general family of hard core predicates, then, for every
$\epsilon > 0$, it must contain a function which is not
$(n^{1-\epsilon}, 1/17)$-smooth.

The close connection between smoothness and average sensitivity implies the 
following. 

Corollary 2. If $B$ is a general family of hard core predicates then, for every
$\epsilon > 0$, $B$ must contain a function with average sensitivity greater than
$n^{1-\epsilon}$ for all sufficiently large $n$.

Proof sketch. The lower bound on the sensitivity follows from equation (1)
coupled with the lower bound on smoothness. If $S(b) \le n^{1-\epsilon}$, then for
$\epsilon' < \epsilon$, $b$ is
$(n^{1-\epsilon'}, n^{\epsilon' - \epsilon})$-smooth. □

A celebrated theorem of Linial, Mansour, and Nisan shows that functions in
$AC^0$ are smooth:

Theorem 2. Let $f: \{0,1\}^* \to \{\pm 1\}$ be a Boolean function with
polynomial-size constant depth circuits. Then $f$ is
$(\log^{O(1)} n, o(1))$-smooth.

An immediate corollary is a theorem of Goldmann and Näslund [?] asserting
that a general family of hard core predicates must contain predicates outside
$AC^0$:

Corollary 3. If $B$ is a general family of hard core predicates, then it must
contain a function which is not in $AC^0$.

It is interesting to note the folklore theorem [?] which asserts that any
monotone function $f$ has small average sensitivity:

Lemma 5. Let $f$ be a monotone Boolean function, then $S(f) = O(\sqrt{n})$.

Clearly, the same bound holds for any generalized monotone function (a
generalized monotone function is obtained by negating some of the inputs to a
monotone function). In light of the above, the following is immediate:

Corollary 4. If B is a general family of hard core predicates, then it must con- 
tain a non-monotone function. 

A Boolean function $f : \{0,1\}^n \to \{\pm 1\}$ is a d-threshold function if there
exists a real multivariate polynomial $p \in \mathbb{R}[x_1, \ldots, x_n]$ of total degree d or less
so that $\forall (x_1, \ldots, x_n) \in \{0,1\}^n$, $f(x_1, \ldots, x_n) = \mathrm{sign}\, p(x_1, \ldots, x_n)$. When d = 1
such functions are generalized threshold functions and are generalized monotone
functions; their average sensitivity is addressed in Lemma 5 above.

In general, it has been shown by Gotsman and Linial jSI that d-threshold
functions are $(d, 1 - c_d)$-smooth, for a constant $c_d > 0$ independent of n. Though
this is not strong enough for our application, they show that under the added
assumption that f is symmetric, one has

$$S(f) \le 2^{-n+1} \sum_{k=0}^{d-1} \binom{n}{\lfloor (n-k)/2 \rfloor} \left( n - \left\lfloor \frac{n-k}{2} \right\rfloor \right)$$


624 



Mikael Goldmann and Alexander Russell 



where $\lfloor x \rfloor$ is the integer part of x. Observe that when $d = O(n^{1/2-\epsilon})$ this quantity
is $O(n^{1-\epsilon})$. Then the following is immediate.

Corollary 5. If B is a general family of hard core predicates, then it must
contain a function which, for large enough n, cannot be expressed as the sign of a
symmetric polynomial of degree $d = O(n^{1/2-\epsilon})$ for any ε > 0.



6 A Remark on the Uniform Version of the
Goldreich-Levin Theorem



There is also a uniform perspective on the Goldreich-Levin construction. Given
any one-way function f we construct a new one-way function $g_f$ (defined on even
length inputs) where

$g_f^{(2n)} : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}^n \times \{0,1\}^n$.

Let an n-bit string y encode a set $\sigma(y) \subseteq \{1, \ldots, n\}$ in the natural way. Define

$g_f^{(2n)}(x, y) = (f(x), y)$.

Then, for any one-way function f, the predicate $b_{GL}$ is a hard-core predicate
for $g_f$. Note that this is a uniform construction, and that the predicate $b_{GL}$ is
independent of f.

A natural way to generalize this would be to consider a construction where,
for every one-way function f(x), there is a padded version $g_f(x, y) = (f(x), y)$
with |y| = p(|x|) for some polynomial p, and a predicate b such that b(x, y) is
a hard core predicate for $g_f$. What kind of lower bounds can one show for, for
instance, the sensitivity of b?

Our results extend to this notion of a general hard core predicate simply by
observing that if b is a hard-core predicate for $g_f$ for all f, then the family

$\{\, c : \{0,1\}^n \to \{\pm 1\} \mid c^{(n)}(x) = b^{(n+p(n))}(x, y_0)$ for some $y_0 \in \{0,1\}^{p(n)} \,\}$

is a general family of hard core predicates. It follows for example that b cannot be
monotone and cannot be computable by AC°-circuits. The results on sensitivity
also carry over, but observe that the parameter n refers to the length of the
input x, not to the combined length of x and y.



7 Conclusion and Open Questions 

The results presented here indicate a certain degree of optimality on behalf of 
the Goldreich-Levin construction (see also 0). (Observe that with probability 
1 a function selected from $B_{GL}$ will have linear average sensitivity.) Also, it
suggests a connection between families of universal hash functions and general 
hard core predicates. On the one hand, several well-known examples of universal
hash functions have been shown to be general hard core predicates mm, 
and on the other hand, smooth functions make poor hash functions as well as 
poor hard core predicates. An interesting (and very open-ended) problem is to 
determine if there is a nice connection between universal hash functions and 
hard core predicates. 
Abstract. A visual cryptography scheme for a set $\mathcal{P}$ of n participants
is a method to encode a secret image into n shadow images called shares,
each of which is given to a distinct participant. Certain qualified subsets
of participants can recover the secret image, whereas forbidden subsets of
participants have no information on the secret image. The shares given
to participants in $X \subseteq \mathcal{P}$ are xeroxed onto transparencies. If X is
qualified then the participants in X can visually recover the secret image by
stacking their transparencies without any cryptography knowledge and
without performing any cryptographic computation.

This is the first paper which analyzes the amount of randomness needed 
to visually share a secret image. It provides lower and upper bounds 
to the randomness of visual cryptography schemes. Our schemes repre- 
sent a dramatic improvement on the randomness of all previously known 
schemes. 



Keywords: Cryptography, Randomness, Secret Sharing, Visual Cryptography. 

1 Introduction 

A visual cryptography scheme (VCS) for a set $\mathcal{P}$ of n participants is a method to
encode a secret image into n shadow images called shares, each of which is given
to a distinct participant. Certain qualified subsets of participants can recover
the secret image, whereas forbidden subsets of participants have no information
on the secret image. The specification of all qualified and forbidden subsets of
participants constitutes an access structure. The shares given to participants in
$X \subseteq \mathcal{P}$ are xeroxed onto transparencies. If X is qualified then the participants in
X can visually recover the secret image by stacking their transparencies without
any cryptography knowledge and without performing any cryptographic
computation. The definition of visual cryptography scheme was given by Naor and
Shamir in PI- They analyzed (k, n)-threshold visual cryptography schemes, that
is, schemes where any subset of k participants is qualified, whereas groups of fewer
than k participants are forbidden. The model by Naor and Shamir has been
extended in m to general access structures.

All previous papers on visual cryptography mainly focus on two parameters: 
the pixel expansion, which represents the number of subpixels in the encoding of 
the original image, and the contrast, which measures the “difference” between a
black and a white pixel in the reconstructed image. In particular, several results 
on the contrast and the pixel expansions of VCSs can be found in 



This is the first paper which analyzes the amount of randomness needed
to visually share a secret image. Random bits are a natural computational
resource which must be taken into account when designing cryptographic
algorithms. Considerable effort has been devoted to reducing the number of bits
used by probabilistic algorithms (see for example [ 1 2j ) and to analyzing the
amount of randomness required in order to achieve a given performance.
Motivated by the fact that “truly” random bits are hard to generate, the
possibility of using imperfect sources of randomness in randomized algorithms
m- has also been investigated. The amount of randomness used in a computation
is an important issue in many practical applications.

Suppose we want to secretly share an image among four participants in such a
way that groups of at most three participants have no information on the secret
image. The previously known VCS for such an access structure is due to Naor and
Shamir na and uses log(8!) ≈ 15.3 random bits per pixel. In this paper we
present a minimum randomness VCS for that access structure which uses only 3
random bits per pixel. For typical images of tens of thousands or hundreds of
thousands of pixels, the difference between the randomness of these two VCSs
considerably affects the time and space needed for the encoding.

Randomness has played a significant role in a cryptography project recently
realized at the University of Salerno. The main goal of the project was the
creation of a web server which implements visual cryptography schemes for
arbitrary access structures. To speed up the share generation, the random
bits are produced beforehand and stored in a file. Because of space limitations,
the file is dynamically managed in such a way that it contains enough random
numbers to satisfy a certain number of client requests. Since the file management
is very time consuming and adds a considerable overhead to server computation,
the amount of randomness involved in the share generation greatly affects the
efficiency of the server.
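
The gap between the two figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes base-2 logarithms and assumes that the 8! counts the column permutations of an 8-column matrix; the image size is a hypothetical value chosen only for illustration.

```python
import math

def permutation_scheme_bits(m: int) -> float:
    """Random bits per pixel when one of the m! column permutations is drawn uniformly."""
    return math.log2(math.factorial(m))

naor_shamir_bits = permutation_scheme_bits(8)  # log2(8!), assumed reading of log(8!)
minimum_bits = 3                               # figure stated in the text
pixels = 100_000                               # hypothetical image size

print(f"{naor_shamir_bits:.1f} vs {minimum_bits} random bits per pixel")
print(f"{naor_shamir_bits * pixels:.0f} vs {minimum_bits * pixels} random bits per image")
```

For an image of this size the difference is roughly a factor of five in the number of random bits the dealer has to draw.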



This paper provides lower and upper bounds to the randomness of visual 
cryptography schemes. Our schemes represent a dramatic improvement on the 
randomness of all previously known schemes. 

Outline of the Paper The model we employ to secretly share an image among
n participants is described in Section 2. In Section 3 we provide a simple
technique to obtain lower bounds on the randomness of any VCS and derive a lower
bound on the randomness of (k, n)-threshold VCSs, for any 2 ≤ k ≤ n. In
Section 4 we give a complete characterization of (k, k)-threshold VCSs with both
minimum randomness and minimum pixel expansion. Section 5 deals with visual
cryptography schemes for general access structures and provides tools to derive
upper bounds on the randomness of VCSs for any access structure. In Section 6
we provide a technique to construct minimum randomness (2, n)-threshold VCSs
for any n ≥ 2, and a technique to construct (k, n)-threshold VCSs for any value
of k and n, with n ≥ k ≥ 2, which dramatically improves on the previously
known constructions with respect to the randomness.
2 The Model

Let $\mathcal{P} = \{1, \ldots, n\}$ be a set of elements called participants, and let $2^{\mathcal{P}}$ denote the
set of all subsets of $\mathcal{P}$. Let $\Gamma_{Qual} \subseteq 2^{\mathcal{P}}$ and $\Gamma_{Forb} \subseteq 2^{\mathcal{P}}$, where $\Gamma_{Qual} \cap \Gamma_{Forb} = \emptyset$. We
refer to members of $\Gamma_{Qual}$ as qualified sets and we call members of $\Gamma_{Forb}$ forbidden
sets. The pair $(\Gamma_{Qual}, \Gamma_{Forb})$ is called the access structure of the scheme.

Let $\Gamma_0$ consist of all the minimal qualified sets: $\Gamma_0 = \{A \in \Gamma_{Qual} : A' \notin
\Gamma_{Qual}$ for all $A' \subset A\}$. A participant $P \in \mathcal{P}$ is an essential participant if there
exists a set $X \subseteq \mathcal{P}$ such that $X \cup \{P\} \in \Gamma_{Qual}$ but $X \notin \Gamma_{Qual}$. A non-essential
participant does not need to participate “actively” in the reconstruction of the
image, since the information he has is not needed by any set in $\mathcal{P}$ in order to
recover the shared image. In any VCS having non-essential participants, these
participants do not require any information in their shares. If a participant P
is not essential then we can construct a visual cryptography scheme giving him
nothing as his share or, as we will see later, a share completely “white”.

In the case where $\Gamma_{Qual}$ is monotone increasing, $\Gamma_{Forb}$ is monotone decreasing,
and $\Gamma_{Qual} \cup \Gamma_{Forb} = 2^{\mathcal{P}}$, the access structure is said to be strong, and $\Gamma_0$ is termed
a basis. (This situation is the usual setting for traditional secret sharing.) In a
strong access structure, $\Gamma_{Qual} = \{C \subseteq \mathcal{P} : B \subseteq C$ for some $B \in \Gamma_0\}$, and we say
that $\Gamma_{Qual}$ is the closure of $\Gamma_0$.
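
The notions of minimal qualified set, closure, and essential participant can be made concrete with a small sketch. The function names and the example structure on four participants are illustrative assumptions, not part of the paper; sets are enumerated by brute force, so this is only meant for small participant sets.

```python
from itertools import combinations

def powerset(participants):
    """All subsets of the participant set, as frozensets."""
    items = sorted(participants)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def minimal_qualified(qualified):
    """Qualified sets none of whose proper subsets is qualified."""
    qualified = {frozenset(a) for a in qualified}
    return {a for a in qualified if not any(b < a for b in qualified)}

def closure(basis, participants):
    """All subsets of the participants that contain some basis set."""
    basis = [frozenset(b) for b in basis]
    return {c for c in powerset(participants) if any(b <= c for b in basis)}

def essential(participants, qualified):
    """Participants p for which some X has X + {p} qualified but X not qualified."""
    qualified = {frozenset(a) for a in qualified}
    return {p for p in participants
            if any((x | {p}) in qualified and x not in qualified
                   for x in powerset(participants))}

participants = {1, 2, 3, 4}
basis = [{1, 2}, {2, 3}]                 # hypothetical basis
qualified = closure(basis, participants)

print(sorted(map(sorted, minimal_qualified(qualified))))
print(sorted(essential(participants, qualified)))
```

In this example participant 4 belongs to no minimal qualified set, so it is non-essential and could be handed an all-white share.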

Notice that if a set of participants X is a superset of a qualified set X', then
they can recover the shared image by considering only the shares of the set X'.
This does not in itself rule out the possibility that stacking all the transparencies
of the participants in X does not reveal any information about the shared image.

A (k, n)-threshold structure $(\Gamma_{Qual}, \Gamma_{Forb})$ on a set $\mathcal{P}$ of n participants is any
access structure in which $\Gamma_0 = \{B \subseteq \mathcal{P} : |B| = k\}$ and $\Gamma_{Forb} = \{B \subseteq \mathcal{P} : |B| <
k\}$. A VCS for a (k, n)-threshold structure is called a (k, n)-threshold VCS.

We assume that the message consists of a collection of black and white pixels.
Each pixel appears in n versions called shares, one for each transparency. Each
share is a collection of m black and white subpixels. The resulting structure can
be described by an n × m Boolean matrix $M = [m_{ij}]$ where $m_{ij} = 1$ iff the j-th
subpixel in the i-th transparency is black. Therefore the grey level of the
combined share, obtained by stacking the transparencies $i_1, \ldots, i_s$, is proportional
to the Hamming weight w(V) of the m-vector $V = OR(R_{i_1}, \ldots, R_{i_s})$ where
$R_{i_1}, \ldots, R_{i_s}$ are the rows of M associated with the transparencies we stack. This
grey level is interpreted by the visual system of the users as black or as white
according to some rule of contrast.
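
Stacking transparencies corresponds to a column-wise Boolean OR of the selected rows, and the grey level to the Hamming weight of the result. The following minimal sketch illustrates this; the matrix and the helper names are hypothetical, and rows are indexed from 0.

```python
def stack(matrix, rows):
    """Column-wise OR of the selected rows (0-based) of a Boolean matrix."""
    return [int(any(matrix[i][j] for i in rows)) for j in range(len(matrix[0]))]

def hamming_weight(vector):
    """Number of black subpixels in a stacked share."""
    return sum(vector)

# Hypothetical 3 x 4 matrix: three transparencies, four subpixels each.
M = [[1, 0, 1, 0],
     [0, 1, 1, 0],
     [1, 1, 0, 0]]

V = stack(M, [0, 1])
print(V, hamming_weight(V))
```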

Definition 1. Let $(\Gamma_{Qual}, \Gamma_{Forb})$ be an access structure on a set of n participants.
Two collections (multisets) of n × m Boolean matrices $\mathcal{C}_0$ and $\mathcal{C}_1$ constitute a
visual cryptography scheme $(\Gamma_{Qual}, \Gamma_{Forb})$-VCS if there exist a value α(m) and a
collection $\{(X, t_X)\}_{X \in \Gamma_{Qual}}$ satisfying:

1. Any (qualified) set $X = \{i_1, i_2, \ldots, i_p\} \in \Gamma_{Qual}$ can recover the shared image
by stacking their transparencies.

Formally, for any $M \in \mathcal{C}_0$, the “or” V of rows $i_1, i_2, \ldots, i_p$ satisfies $w(V) \le
t_X - \alpha(m) \cdot m$; whereas, for any $M \in \mathcal{C}_1$ it results that $w(V) \ge t_X$.

2. Any (forbidden) set $X = \{i_1, i_2, \ldots, i_p\} \in \Gamma_{Forb}$ has no information on the
shared image.

Formally, the two collections of p × m matrices $\mathcal{D}_b$, with $b \in \{0, 1\}$, obtained
by restricting each n × m matrix in $\mathcal{C}_b$ to rows $i_1, i_2, \ldots, i_p$ are
indistinguishable in the sense that they contain the same matrices with the same
frequencies.

Each pixel of the original image will be encoded into n pixels, each of which
consists of m subpixels. To share a white (black, resp.) pixel, the dealer randomly
chooses one of the matrices in $\mathcal{C}_0$ ($\mathcal{C}_1$, resp.) and distributes row i to participant
i, for i = 1, …, n.

The first property of Definition 1 is related to the contrast of the image.
It states that when a qualified set of users stack their transparencies they can
correctly recover the shared image. The value α(m) is called relative difference
and the number γ = α(m) · m, which is assumed to be an integer, is referred to as
the contrast of the image. The set $\{(X, t_X)\}_{X \in \Gamma_{Qual}}$ is called the set of thresholds
and $t_X$ is the threshold associated to $X \in \Gamma_{Qual}$. We want the contrast to be as
large as possible and at least one, that is, α(m) ≥ 1/m. The second property
is called security, since it implies that, even by inspecting all their shares, a
forbidden set of participants cannot gain any information in deciding whether
the shared pixel was white or black.
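
The two properties can be checked mechanically on small schemes. The sketch below is a brute-force checker under stated assumptions: the function names are hypothetical, qualified and forbidden sets are given as tuples of 0-based row indices, and the example collections are the well-known two-subpixel (2, 2)-threshold scheme rather than a construction taken from this paper.

```python
from collections import Counter
from fractions import Fraction

def weight_of_stack(matrix, rows):
    """Hamming weight of the OR of the selected rows."""
    return sum(int(any(matrix[i][j] for i in rows)) for j in range(len(matrix[0])))

def restricted_frequencies(collection, rows):
    """Relative frequency of each sub-matrix obtained by keeping only `rows`."""
    counts = Counter(tuple(tuple(m[i]) for i in rows) for m in collection)
    return {sub: Fraction(c, len(collection)) for sub, c in counts.items()}

def is_vcs(c0, c1, thresholds, forbidden, contrast):
    """thresholds maps each qualified set to t_X; contrast stands for alpha(m) * m."""
    for rows, t in thresholds.items():
        if any(weight_of_stack(m, rows) > t - contrast for m in c0):
            return False  # some white pixel is too dark
        if any(weight_of_stack(m, rows) < t for m in c1):
            return False  # some black pixel is too light
    return all(restricted_frequencies(c0, rows) == restricted_frequencies(c1, rows)
               for rows in forbidden)

# Classic (2,2)-threshold scheme with m = 2 subpixels.
C0 = [[[1, 0], [1, 0]], [[0, 1], [0, 1]]]
C1 = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]

print(is_vcs(C0, C1, thresholds={(0, 1): 2}, forbidden=[(0,), (1,)], contrast=1))
```

Comparing relative frequencies rather than raw counts allows the two collections to have different sizes, as the definition permits.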

The model of visual cryptography we consider is the same as that described
in m- This model is a generalization of the one proposed in P!> since with
each set $X \in \Gamma_{Qual}$ we associate a (possibly) different threshold $t_X$. Further, the
access structure is not required to be strong in our model.

Notice that $\mathcal{C}_0$ ($\mathcal{C}_1$) is a multiset of n × m Boolean matrices, therefore we
allow a matrix to appear more than once in $\mathcal{C}_0$ ($\mathcal{C}_1$). Moreover, the size of the
collections $\mathcal{C}_0$ and $\mathcal{C}_1$ does not need to be the same.

The randomness of a visual cryptography scheme represents the number of
random bits per pixel used by the dealer to share an image among the
participants. The definition of randomness for secret sharing schemes has been
introduced in 1^. Since visual cryptography schemes are a special kind of secret
sharing schemes, then, following 0, we define the randomness of a VCS
realized by $\mathcal{C}_0$ and $\mathcal{C}_1$ as $\mathcal{R}(\mathcal{C}_0, \mathcal{C}_1)_p = p \log |\mathcal{C}_0| + (1 - p) \log |\mathcal{C}_1|$, where p denotes
the probability (frequency) of the white pixels in the image to be encoded. Let
$\Gamma = (\Gamma_{Qual}, \Gamma_{Forb})$ be a given access structure. In accordance with ^j, the
randomness of the access structure Γ is defined as $\mathcal{R}_\Gamma = \min_{\mathcal{A}} \min_{p \in X} \mathcal{R}(\mathcal{C}_0, \mathcal{C}_1)_p$, where
$\mathcal{A}$ denotes the set of all pairs of collections $\mathcal{C}_0$ and $\mathcal{C}_1$ realizing a VCS for Γ,
and X = [0, 1] is the range of all values of the probability p. This definition
is equivalent to the following: $\mathcal{R}_\Gamma = \min_{\mathcal{A}} \log(\min\{|\mathcal{C}_0|, |\mathcal{C}_1|\})$. The above
definition implies that, given a pair of matrix collections $\mathcal{C}_0$ and $\mathcal{C}_1$ realizing a
VCS for the access structure Γ, we are mainly concerned with the quantity
$\log(\min\{|\mathcal{C}_0|, |\mathcal{C}_1|\})$. Hence, we define the randomness $\mathcal{R}(\mathcal{C}_0, \mathcal{C}_1)$ of a VCS
realized by $\mathcal{C}_0$ and $\mathcal{C}_1$ as $\mathcal{R}(\mathcal{C}_0, \mathcal{C}_1) = \log(\min\{|\mathcal{C}_0|, |\mathcal{C}_1|\})$. We point out that all
VCSs presented in this paper are realized by equal sized matrix collections $\mathcal{C}_0$
and $\mathcal{C}_1$. As a consequence, all our upper bounds hold even if we alternatively
define the randomness of the VCS to be the logarithm of the maximum of $|\mathcal{C}_0|$
and $|\mathcal{C}_1|$. Notice that the randomness of any VCS is at least one in that the
share assigned to any essential participant has to be chosen in a set of size at
least two.

Observe that any subset of a forbidden subset is forbidden, so $\Gamma_{Forb}$ is
necessarily monotone decreasing. Moreover, it is easy to see that no superset of a
qualified subset is forbidden. Hence, a strong access structure is simply one in
which $\Gamma_{Qual}$ is monotone increasing and $\Gamma_{Qual} \cup \Gamma_{Forb} = 2^{\mathcal{P}}$.

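Notice also that, given an (admissible) access structure $(\Gamma_{Qual}, \Gamma_{Forb})$, we can
“embed” it in a strong access structure $(\Gamma'_{Qual}, \Gamma'_{Forb})$ in which $\Gamma_{Qual} \subseteq \Gamma'_{Qual}$ and
$\Gamma_{Forb} \subseteq \Gamma'_{Forb}$. For instance, $(\Gamma'_{Qual}, \Gamma'_{Forb})$ can be taken to be the strong access
structure whose basis $\Gamma_0$ consists of the minimal sets in $\Gamma_{Qual}$.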

In view of the above observations, upper bounds on the randomness of VCSs 
for a strong access structure extend also to the access structures which can be 
embedded in it. 

Most of the VCSs presented in literature !TI2m can be represented by
means of two n × m characteristic matrices, $S^0$ and $S^1$, called basis matrices. The
collections $\mathcal{C}_0$ and $\mathcal{C}_1$ are obtained by permuting the columns of the corresponding
basis matrix ($S^0$ for $\mathcal{C}_0$, and $S^1$ for $\mathcal{C}_1$) in all possible ways. Hence, the collections
$\mathcal{C}_0$ and $\mathcal{C}_1$ both have size equal to m!. The algorithm for the VCS based on the
previous construction of the collections $\mathcal{C}_0$ and $\mathcal{C}_1$ has small memory requirements
(it keeps only the basis matrices $S^0$ and $S^1$) and it is efficient (to choose a matrix
in $\mathcal{C}_0$ ($\mathcal{C}_1$, resp.) it only generates a permutation of the columns of $S^0$ ($S^1$, resp.)).

3 Lower Bounds on the Randomness of VCSs 

The following theorem is a useful tool to derive lower bounds on the randomness 
of VCSs for any access structure. 

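Theorem 1. Let $\mathcal{C}_0$ and $\mathcal{C}_1$ realize a VCS for the access structure $(\Gamma_{Qual}, \Gamma_{Forb})$
on a set of participants $\mathcal{P}$. Let $\mathcal{G}$ be a subset of $\Gamma_{Forb}$ with the property that
for any pair of distinct sets $A, B \in \mathcal{G}$, there exists a set $C \in \Gamma_{Qual}$ such that
$C \subseteq A \cup B$. For any $i \in \mathcal{P}$, let $\mathcal{G}_i = \{A \in \mathcal{G} : i \in A\}$ and let $d_i = |\{R : M[\{i\}] =
R$ for some $M \in \mathcal{C}_0\}| \ge 2$. Then, $\mathcal{C}_0$ and $\mathcal{C}_1$ both have size larger than or equal
to $\max\{|\mathcal{G}|, \max_{i \in \mathcal{P}} \{d_i \cdot |\mathcal{G}_i|\}\}$.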

As an application of the above theorem, we derive a lower bound on the size of the
matrix collections realizing a VCS for a (k, n)-threshold structure $(\Gamma_{Qual}, \Gamma_{Forb})$.
Let $\mathcal{G}$ denote the family of all subsets of $\mathcal{P}$ of size k − 1. It is $\mathcal{G} \subseteq \Gamma_{Forb}$.
Moreover, for any $A, B \in \mathcal{G}$, with A ≠ B, one has that $A \cup B$ contains at least
a subset of $\mathcal{P}$ of size k. Hence, $\mathcal{G}$ satisfies the hypothesis of Theorem 1. Let
$\mathcal{G}_i = \{A \in \mathcal{G} : i \in A\}$, for i = 1, …, n. It is $|\mathcal{G}_i| = \binom{n-1}{k-2}$. For i = 1, …, n, let $d_i$
be defined as in Theorem 1. Then, Theorem 1 implies that $\mathcal{C}_0$ and $\mathcal{C}_1$ both have
size larger than or equal to both $\binom{n}{k-1}$ and $(\max_{i \in \{1, \ldots, n\}} \{d_i\}) \binom{n-1}{k-2}$.

The following theorem provides a better lower bound on the size of the
collections $\mathcal{C}_0$ and $\mathcal{C}_1$ realizing a (k, n)-threshold VCS.

Theorem 2. Let $\mathcal{C}_0$ and $\mathcal{C}_1$ be two matrix collections realizing a (k, n)-threshold
VCS, with n ≥ k ≥ 2. $\mathcal{C}_0$ and $\mathcal{C}_1$ both have size larger than or equal to
$(n - k + 2)^{k-1}$. Consequently, the randomness of a (k, n)-threshold VCS is at least
$(k - 1) \log(n - k + 2)$.
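
For concrete parameters the two lower bounds are easy to tabulate. The sketch below assumes base-2 logarithms and, for the bound obtained from Theorem 1, plugs in the weakest admissible value $d_i = 2$; the function names are illustrative only.

```python
import math

def theorem1_size_bound(n: int, k: int, d: int = 2) -> int:
    """max{|G|, d * |G_i|} with G = all (k-1)-subsets of n participants."""
    return max(math.comb(n, k - 1), d * math.comb(n - 1, k - 2))

def theorem2_size_bound(n: int, k: int) -> int:
    """Lower bound (n - k + 2)^(k - 1) on the size of each collection."""
    return (n - k + 2) ** (k - 1)

def theorem2_random_bits(n: int, k: int) -> float:
    """Lower bound (k - 1) * log2(n - k + 2) on the randomness."""
    return (k - 1) * math.log2(n - k + 2)

for n, k in [(4, 4), (5, 3), (5, 2)]:
    print((k, n), theorem1_size_bound(n, k), theorem2_size_bound(n, k),
          theorem2_random_bits(n, k))
```

For the (4, 4)-threshold structure the second bound gives 8 matrices, that is 3 random bits per pixel, which agrees with the figure quoted in the Introduction.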

4 Minimum Randomness (k, k)-Threshold VCSs

In this section we give a characterization of (k, k)-threshold VCSs, for k ≥ 2,
with both minimum randomness and minimum pixel expansion. Notice that from
Theorem 2 it follows that two collections $\mathcal{C}_0$ and $\mathcal{C}_1$ realizing a (k, k)-threshold
VCS both have size at least $2^{k-1}$. Hence, the randomness of a (k, k)-threshold
VCS is at least k − 1.

In the following we provide a construction for (k, k)-threshold VCSs with
minimum randomness k − 1. In these (k, k)-threshold VCSs each participant is
assigned two row vectors as share.

For any i = 1, …, k, let $R_i$ and $\overline{R}_i$ denote two row vectors and let v be a
k-entry binary vector. We denote with $M(v, R_1, \ldots, R_k, \overline{R}_1, \ldots, \overline{R}_k)$ the k-row
matrix whose i-th row, i = 1, …, k, is equal to $R_i$ if the i-th entry of v is 0, and
to $\overline{R}_i$ otherwise.

In the following we will refer to columns having even weight as even columns 
and to those having odd weight as odd columns. 

We will denote with $M_{h,k}$, for k ≥ 2 and h ≥ 1, a $k \times h2^{k-1}$ matrix which
contains all even columns with multiplicity h and no odd column. The following
theorem holds.

Theorem 3. For any k ≥ 2 and h ≥ 1, let $R_i$ be the i-th row of $M_{h,k}$
and let $\overline{R}_i$ denote the bitwise complement of $R_i$. The matrix collections
$\mathcal{C}_0 = \{M(v, R_1, \ldots, R_k, \overline{R}_1, \ldots, \overline{R}_k) : w(v)$ is even$\}$ and $\mathcal{C}_1 = \{M(v, R_1, \ldots,
R_k, \overline{R}_1, \ldots, \overline{R}_k) : w(v)$ is odd$\}$ realize a (k, k)-threshold VCS with pixel
expansion $h2^{k-1}$, relative difference $\alpha(m) = 1/2^{k-1}$ and minimum randomness k − 1.

In order to assign the shares to each participant, the dealer does not have to
construct all matrices of the collections $\mathcal{C}_0$ and $\mathcal{C}_1$ of Theorem 3. Alternatively,
every time a pixel has to be shared among the k participants, the dealer
randomly chooses

    a k-entry binary vector v of even weight if the pixel is white,
    a k-entry binary vector v of odd weight if the pixel is black,

and for i = 1, …, k, selects the i-th row of $M(v, R_1, \ldots, R_k, \overline{R}_1, \ldots, \overline{R}_k)$ as share
for participant i.
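
The dealer's procedure can be sketched as follows. This is only an illustration under stated assumptions: the rows are taken from a matrix holding every even-weight column h times, each row is complemented where v has a 1, and the parity of v is fixed by drawing k − 1 random bits and deriving the last one. Function names and the use of Python's `random` module are choices made here, not part of the paper.

```python
from itertools import product
import random

def even_columns_matrix(k, h=1):
    """k x h*2^(k-1) matrix containing every even-weight column h times."""
    cols = [c for c in product((0, 1), repeat=k) if sum(c) % 2 == 0] * h
    return [[col[i] for col in cols] for i in range(k)]

def build_matrix(v, rows):
    """Row i is rows[i] if v[i] == 0, otherwise its bitwise complement."""
    return [[bit ^ v_i for bit in row] for v_i, row in zip(v, rows)]

def share_pixel(black, k, h=1, rng=random):
    """Shares for one pixel, using exactly k - 1 random bits."""
    rows = even_columns_matrix(k, h)
    v = [rng.randrange(2) for _ in range(k - 1)]
    v.append((sum(v) + int(black)) % 2)   # forces w(v) even for white, odd for black
    return build_matrix(v, rows)

def stacked_weight(matrix):
    """Hamming weight of the OR of all rows."""
    return sum(int(any(col)) for col in zip(*matrix))

k = 3
white_weights = {stacked_weight(share_pixel(False, k)) for _ in range(50)}
black_weights = {stacked_weight(share_pixel(True, k)) for _ in range(50)}
print(white_weights, black_weights)
```

With k = 3 and h = 1 the stacked shares of a white pixel always have weight 3 and those of a black pixel weight 4, so the contrast is h = 1 on m = 4 subpixels.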

The following lemma provides a first characterization of a minimum
randomness (k, k)-threshold VCS.

Lemma 1. Let $\mathcal{C}_0$ and $\mathcal{C}_1$ be two matrix collections realizing a minimum
randomness (k, k)-threshold VCS, k ≥ 2. Then, for any i = 1, …, k, the set $\mathcal{R}_i =
\{R$ : there is $M \in \mathcal{C}_0 \cup \mathcal{C}_1$ such that the i-th row of M is $R\}$ consists of only
two row vectors.



632 



Annalisa De Bonis and Alfredo De Santis 



Let $M \in \mathcal{C}_0$ and let $\mathcal{R}_i = \{R_i, \overline{R}_i\}$, with $R_i$ being the i-th row of M, for i =
1, …, k. Then, it results $\mathcal{C}_0 = \{M(v, R_1, \ldots, R_k, \overline{R}_1, \ldots, \overline{R}_k) : w(v)$ is even$\}$
and $\mathcal{C}_1 = \{M(v, R_1, \ldots, R_k, \overline{R}_1, \ldots, \overline{R}_k) : w(v)$ is odd$\}$.

In order to characterize the structure of a minimum randomness (k, k)-threshold
VCS, we fix the value of the contrast γ = α(m) · m. Recall that such a quantity
measures the “difference” between a black and a white pixel in the reconstructed
image. For any k ≥ 2 and γ ≥ 1, we will denote with $m^*(k, \gamma)$ the pixel expansion
of the (k, k)-threshold VCS with smallest pixel expansion among those having
contrast γ. We will show that, for any given value of the contrast γ, the
construction of Theorem 3 is the only one providing a (k, k)-threshold VCS with
minimum pixel expansion $m^*(k, \gamma) = \gamma 2^{k-1}$ and minimum randomness k − 1.
First we show that in any (k, k)-threshold VCS with contrast γ and with
minimum pixel expansion $m^*(k, \gamma)$, each matrix of $\mathcal{C}_0$ contains all even columns with
multiplicity γ, whereas each matrix of $\mathcal{C}_1$ contains all odd columns with the same
multiplicity.

Theorem 4. Let γ ≥ 1, and let $\mathcal{C}_0$ and $\mathcal{C}_1$
be two matrix collections realizing a (k, k)-threshold VCS, k ≥ 2, with contrast
γ and with minimum pixel expansion $m^*(k, \gamma)$. Each matrix in $\mathcal{C}_0$ consists of
all even columns each occurring with multiplicity γ, whereas each matrix in $\mathcal{C}_1$
consists of all odd columns each occurring with the same multiplicity γ.
Consequently, $m^*(k, \gamma) = \gamma 2^{k-1}$.

From Theorem 4 it follows that, for any given value of the contrast γ, the
(k, k)-threshold VCS of Theorem 3 is optimal with respect to the pixel expansion.
Moreover, for any (k, k)-threshold VCS with both minimum randomness and
pixel expansion $m^*(k, \gamma)$, the following theorem holds.

Theorem 5. Let $\mathcal{C}_0$ and $\mathcal{C}_1$ be two matrix collections realizing a minimum
randomness (k, k)-threshold VCS, k ≥ 2, with contrast γ and minimum pixel
expansion $m^*(k, \gamma) = \gamma 2^{k-1}$. For any i = 1, …, k, the set $\mathcal{R}_i = \{R$ : there is $M \in
\mathcal{C}_0 \cup \mathcal{C}_1$ such that the i-th row of M is $R\}$ consists of two row vectors, one being
the bitwise complement of the other.

For any k ≥ 2 and any value of the contrast γ ≥ 1, Theorems 4 and 5 provide a
complete characterization of (k, k)-threshold VCSs with minimum randomness
and pixel expansion $m^*(k, \gamma)$, thus proving that, for any given γ ≥ 1, the
construction of Theorem 3 is the only one providing a (k, k)-threshold VCS with
minimum randomness k − 1 and contrast γ = h.

5 Constructions for General Access Structures 

5.1 A Construction Using Cumulative Arrays 

Let $\Gamma = (\Gamma_{Qual}, \Gamma_{Forb})$ be a strong access structure on a set of participants $\mathcal{P}$, and
let $\Gamma_{MFS}$ denote the collection of the maximal forbidden sets of Γ: $\Gamma_{MFS} =
\{B \in \Gamma_{Forb} : B \cup \{i\} \in \Gamma_{Qual}$ for all $i \in \mathcal{P} \setminus B\}$. In this subsection we will
show that the minimum randomness of any visual cryptography scheme is less
than or equal to f − 1 where f is the size of $\Gamma_{MFS}$. Indeed, we show how to
construct visual cryptography schemes with randomness f − 1 for any strong
access structure. Our technique is a generalization of that given in Q which
allows one to obtain the basis matrices of a VCS for a given strong access structure
from the basis matrices of an (f, f)-threshold VCS. We show how to obtain a VCS
for a given strong access structure from any (f, f)-threshold VCS, not necessarily
defined by means of basis matrices. Both our technique and that given in ^ are
based on the cumulative array method introduced in Eg. A cumulative map
(β, T) for $\Gamma_{Qual}$ is a finite set T along with a mapping $\beta : \mathcal{P} \to 2^T$ such that for
any $Q \subseteq \mathcal{P}$, it results $\bigcup_{a \in Q} \beta(a) = T \iff Q \in \Gamma_{Qual}$. Let $\Gamma_{MFS} = \{F_1, \ldots, F_f\}$.
Given a set $T = \{T_1, \ldots, T_f\}$, we can construct a cumulative map (β, T) for any
$\Gamma_{Qual}$ by defining, for any $i \in \mathcal{P}$, $\beta(i) = \{T_j \mid i \notin F_j, 1 \le j \le f\}$.

A cumulative array is a $|\mathcal{P}| \times |T|$ Boolean matrix, denoted by CA, such that
CA(i, j) = 1 if and only if $i \notin F_j$. We can construct a visual cryptography
scheme for any strong access structure $\Gamma = (\Gamma_{Qual}, \Gamma_{Forb})$ as follows. Let CA
be the cumulative array for $\Gamma_{Qual}$ obtained by using the cumulative map (β, T).
Let $\hat{\mathcal{C}}_0 = \{\hat{M}_1^0, \ldots, \hat{M}_{2^{f-1}}^0\}$ and $\hat{\mathcal{C}}_1 = \{\hat{M}_1^1, \ldots, \hat{M}_{2^{f-1}}^1\}$ be two collections of
$f \times h2^{f-1}$, for some h ≥ 1, matrices realizing the (f, f)-threshold VCS of Theorem
3. The collections $\mathcal{C}_0 = \{M_1^0, \ldots, M_{2^{f-1}}^0\}$ and $\mathcal{C}_1 = \{M_1^1, \ldots, M_{2^{f-1}}^1\}$ for a visual
cryptography scheme for the strong access structure $(\Gamma_{Qual}, \Gamma_{Forb})$ are obtained as
follows. For any fixed i let $j_{i,1}, \ldots, j_{i,g_i}$ be the integers j such that CA(i, j) = 1.
The i-th row of $M_\ell^0$ ($M_\ell^1$, resp.) consists of the “or” of the rows $j_{i,1}, \ldots, j_{i,g_i}$ of
$\hat{M}_\ell^0$ ($\hat{M}_\ell^1$, resp.). Hence, the following theorem holds.

Theorem 6. Let $\Gamma = (\Gamma_{Qual}, \Gamma_{Forb})$ be a strong access structure, and let $\Gamma_{MFS}$ be
the family of the maximal forbidden sets in $\Gamma_{Forb}$. Then, there exists a
$(\Gamma_{Qual}, \Gamma_{Forb})$-VCS with randomness $|\Gamma_{MFS}| - 1$, pixel expansion $m = 2^{|\Gamma_{MFS}|-1}$,
and $t_X = m$, for any $X \in \Gamma_{Qual}$.
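
The cumulative array construction can be tried on a small case. The sketch below assumes the (2, 3)-threshold structure, whose maximal forbidden sets are the three singletons, and builds the underlying (3, 3)-threshold scheme from the even-column matrix with h = 1 by complementing rows according to a vector v. All function names are hypothetical.

```python
from collections import Counter
from itertools import product

def kk_collections(k):
    """C_0 and C_1 of a (k,k)-threshold scheme with h = 1 (even / odd weight v)."""
    cols = [c for c in product((0, 1), repeat=k) if sum(c) % 2 == 0]
    rows = [[c[i] for c in cols] for i in range(k)]
    c0, c1 = [], []
    for v in product((0, 1), repeat=k):
        m = [[b ^ vi for b in row] for vi, row in zip(v, rows)]
        (c1 if sum(v) % 2 else c0).append(m)
    return c0, c1

def cumulative_array(participants, max_forbidden):
    """CA(i, j) = 1 iff participant i is not in the j-th maximal forbidden set."""
    return [[int(p not in f) for f in max_forbidden] for p in participants]

def lift(matrix, ca):
    """Row i of the result is the OR of the rows j of `matrix` with CA(i, j) = 1."""
    width = len(matrix[0])
    return [[int(any(matrix[j][c] for j, bit in enumerate(ca_row) if bit))
             for c in range(width)] for ca_row in ca]

def stacked_weight(matrix, rows):
    return sum(int(any(matrix[i][c] for i in rows)) for c in range(len(matrix[0])))

participants = [1, 2, 3]
max_forbidden = [{1}, {2}, {3}]          # (2,3)-threshold structure
ca = cumulative_array(participants, max_forbidden)
hat_c0, hat_c1 = kk_collections(len(max_forbidden))
c0 = [lift(m, ca) for m in hat_c0]
c1 = [lift(m, ca) for m in hat_c1]

pairs = [(0, 1), (0, 2), (1, 2)]
print({stacked_weight(m, p) for m in c0 for p in pairs},
      {stacked_weight(m, p) for m in c1 for p in pairs})
```

Each collection has 4 matrices, so the dealer needs 2 random bits per pixel, one less than the number of maximal forbidden sets, and every qualified pair sees weight 3 for white and weight 4 = m for black.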

5.2 Constructing VCSs from Smaller Schemes 

Ateniese et al. E showed how to construct the VCS for an access structure
$(\Gamma_{Qual}, \Gamma_{Forb}) = (\Gamma'_{Qual} \cup \Gamma''_{Qual}, \Gamma'_{Forb} \cap \Gamma''_{Forb})$, on a set of n participants, using
the VCSs for the structures $(\Gamma'_{Qual}, \Gamma'_{Forb})$ and $(\Gamma''_{Qual}, \Gamma''_{Forb})$. In the following we
denote with ∘ the operator “concatenation” of two matrices. We recall the
following theorem from E-

Theorem 7. Let $(\Gamma'_{Qual}, \Gamma'_{Forb})$ and $(\Gamma''_{Qual}, \Gamma''_{Forb})$ be two access structures on
the same set of participants. Suppose there exist a $(\Gamma'_{Qual}, \Gamma'_{Forb})$-VCS and a
$(\Gamma''_{Qual}, \Gamma''_{Forb})$-VCS with basis matrices $Z^0, Z^1$ and $T^0, T^1$, respectively. Then,
the matrices $S^0 = Z^0 \circ T^0$ and $S^1 = Z^1 \circ T^1$ are the basis matrices of a
$(\Gamma'_{Qual} \cup \Gamma''_{Qual}, \Gamma'_{Forb} \cap \Gamma''_{Forb})$-VCS. The randomness of the resulting VCS is equal
to $\log((|Z^0| + |T^0|)!)$. If the original access structures are both strong, then so is
the resulting access structure.

Let $\mathcal{C}_0$ ($\mathcal{C}_1$, resp.) be the collection of all distinct matrices obtained from $S^0$ ($S^1$,
resp.) by permuting the columns of $Z^0$ ($Z^1$, resp.) and, independently, those
of $T^0$ ($T^1$, resp.). $\mathcal{C}_0$ and $\mathcal{C}_1$ realize a $(\Gamma'_{Qual} \cup \Gamma''_{Qual}, \Gamma'_{Forb} \cap \Gamma''_{Forb})$-VCS with
randomness $\log(m'! \cdot m''!) = \log(m'!) + \log(m''!)$, where m' and m'' denote the
numbers of columns of the two pairs of basis matrices.

Composition can be applied also to VCSs which are not represented by basis
matrices, as the following lemma shows. Let $\mathcal{C}'$ and $\mathcal{C}''$ be two matrix collections.
We denote with $\mathcal{C}' \circ \mathcal{C}''$ the matrix collection $\{M' \circ M'' : M' \in \mathcal{C}'$ and $M'' \in \mathcal{C}''\}$.

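Lemma 2. Let the two matrix collections $\mathcal{C}'_0 = \{M'^0_1, \ldots, M'^0_{s_0}\}$ and $\mathcal{C}'_1 = \{M'^1_1,
\ldots, M'^1_{s_1}\}$ realize a VCS with randomness $\mathcal{R}(\mathcal{C}'_0, \mathcal{C}'_1)$ for the access structure
$(\Gamma'_{Qual}, \Gamma'_{Forb})$ on a set of participants $\mathcal{P}$, and let the two matrix collections $\mathcal{C}''_0 =
\{M''^0_1, \ldots, M''^0_{t_0}\}$ and $\mathcal{C}''_1 = \{M''^1_1, \ldots, M''^1_{t_1}\}$ realize a VCS with randomness
$\mathcal{R}(\mathcal{C}''_0, \mathcal{C}''_1)$ for the access structure $(\Gamma''_{Qual}, \Gamma''_{Forb})$ on the same set of participants
$\mathcal{P}$. Suppose that for any $X \in \Gamma''_{Qual} \setminus \Gamma'_{Qual}$ and for any $i \in \{1, \ldots, s_0\}$ and
$j \in \{1, \ldots, s_1\}$, it results $w(M'^0_i[X]) = w(M'^1_j[X])$, and that for any $X \in \Gamma'_{Qual} \setminus
\Gamma''_{Qual}$ and for any $i \in \{1, \ldots, t_0\}$ and $j \in \{1, \ldots, t_1\}$, it results $w(M''^0_i[X]) =
w(M''^1_j[X])$. Then, the two matrix collections $\mathcal{C}_0 = \mathcal{C}'_0 \circ \mathcal{C}''_0$ and $\mathcal{C}_1 = \mathcal{C}'_1 \circ \mathcal{C}''_1$
realize a VCS for the access structure $(\Gamma'_{Qual} \cup \Gamma''_{Qual}, \Gamma'_{Forb} \cap \Gamma''_{Forb})$ with
randomness $\log(\min\{|\mathcal{C}'_0| \cdot |\mathcal{C}''_0|, |\mathcal{C}'_1| \cdot |\mathcal{C}''_1|\}) \ge \mathcal{R}(\mathcal{C}'_0, \mathcal{C}'_1) + \mathcal{R}(\mathcal{C}''_0, \mathcal{C}''_1)$. If the original access
structures are both strong, then so is the resulting access structure.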

Lemma 2 is often used in conjunction with the following theorem to obtain VCSs
on a set of participants $\mathcal{P}$ from schemes on sets of participants contained in $\mathcal{P}$.

Theorem 8. Let $(\Gamma_{Qual}, \Gamma_{Forb})$ be an access structure on a set of participants
$\mathcal{P}$. Let i be a non-essential participant of $\mathcal{P}$ and let $(\Gamma'_{Qual}, \Gamma'_{Forb})$ be the access
structure on $\mathcal{P} \setminus \{i\}$ with $\Gamma'_{Qual} = \{X \setminus \{i\} : X \in \Gamma_{Qual}\}$. Let the two matrix
collections $\mathcal{C}'_0$ and $\mathcal{C}'_1$ realize a VCS S' for the access structure $(\Gamma'_{Qual}, \Gamma'_{Forb})$.
Then, the two matrix collections $\mathcal{C}_0$ and $\mathcal{C}_1$, obtained by adding an all-zero row
in correspondence of participant i to all matrices in $\mathcal{C}'_0$ and $\mathcal{C}'_1$, realize a VCS for
the access structure $(\Gamma_{Qual}, \Gamma_{Forb})$, with the same randomness as S'.

The following theorem is a consequence of Theorem 3, Lemma 2 and Theorem
8. It provides a technique to derive an upper bound on the randomness of the
VCSs for any access structure.

Theorem 9. Let $(\Gamma_{Qual}, \Gamma_{Forb})$ be a strong access structure on a set of
participants $\mathcal{P}$ with basis $\Gamma_0$. There exists a $(\Gamma_{Qual}, \Gamma_{Forb})$-VCS with randomness
$\sum_{X \in \Gamma_0} (|X| - 1)$.

6 Upper Bounds on the Randomness of (k, n)-Threshold
VCSs

In this section we derive upper bounds on the randomness of (k, n)-threshold
VCSs. We start by providing a construction for minimum randomness
(2, n)-threshold VCSs.
Theorem 10. Let M be an n × m matrix whose rows consist of n distinct binary
vectors each of weight w < m. Let $d_{s,t} = 1 + [(s + t - 1) \bmod n]$. For j = 1, …, n,
let $M_j^0$ denote the n × m matrix having all n rows equal to $M[\{j\}]$ and let $M_j^1$
denote the n × m matrix such that $M_j^1[\{i\}] = M[\{d_{i,j}\}]$, for i = 1, …, n. Then,
the two collections of matrices $\mathcal{C}_0 = \{M_1^0, \ldots, M_n^0\}$ and $\mathcal{C}_1 = \{M_1^1, \ldots, M_n^1\}$
define a strong (2, n)-threshold VCS with minimum randomness log n.
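
A small sketch may help to see why n matrices per collection suffice for a (2, n)-threshold scheme. It assumes the following reading of the construction: each white matrix repeats one row of M in every position, and each black matrix places the rows of M in a cyclically shifted order given by the index rule 1 + ((s + t − 1) mod n). The function names and the choice of the identity matrix for M are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

def two_out_of_n(M):
    """Collections C_0 and C_1 built from an n-row matrix M with distinct rows."""
    n = len(M)

    def shift(s, t):
        # Assumed index rule, 1-based: 1 + ((s + t - 1) mod n).
        return 1 + ((s + t - 1) % n)

    c0 = [[list(M[j - 1]) for _ in range(n)] for j in range(1, n + 1)]
    c1 = [[list(M[shift(i, j) - 1]) for i in range(1, n + 1)] for j in range(1, n + 1)]
    return c0, c1

def stacked_weight(matrix, rows):
    return sum(int(any(matrix[i][c] for i in rows)) for c in range(len(matrix[0])))

n = 3
M = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]    # distinct rows of weight w = 1 < m = 3
c0, c1 = two_out_of_n(M)

pairs = list(combinations(range(n), 2))
print({stacked_weight(m, p) for m in c0 for p in pairs},
      {stacked_weight(m, p) for m in c1 for p in pairs})
```

Any two participants see weight 1 for a white pixel and weight 2 for a black one, while a single participant receives each row of M exactly once in both collections, so one share alone carries no information.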

Both Theorem 6 and Theorem 9 provide a construction for (k, n)-threshold
VCSs. The former implies a (k, n)-threshold VCS with randomness $\binom{n}{k-1} - 1$,
whereas the latter implies a (k, n)-threshold VCS with randomness $(k - 1)\binom{n}{k}$. In
the following we prove two upper bounds on the randomness of (k, n)-threshold
VCSs which represent an improvement on the above mentioned upper bounds.
To this aim, we use a technique based on starting matrices, along the same line
of m where an analogous technique is employed to obtain (k, n)-threshold VCSs
realized by means of basis matrices.
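
To get a feel for how far these two upper bounds are from the lower bound of Section 3, they can be tabulated for small parameters. The sketch assumes the bounds are binom(n, k−1) − 1 and (k − 1)·binom(n, k) respectively, and that logarithms are taken in base 2; the function names are hypothetical.

```python
import math

def cumulative_array_bits(n: int, k: int) -> int:
    """Number of maximal forbidden sets of a (k,n)-threshold structure, minus one."""
    return math.comb(n, k - 1) - 1

def basis_composition_bits(n: int, k: int) -> int:
    """k - 1 random bits for each of the binom(n, k) minimal qualified sets."""
    return (k - 1) * math.comb(n, k)

def lower_bound_bits(n: int, k: int) -> float:
    """Lower bound (k - 1) * log2(n - k + 2) from Section 3."""
    return (k - 1) * math.log2(n - k + 2)

for n, k in [(5, 3), (6, 3), (8, 4)]:
    print((k, n), cumulative_array_bits(n, k), basis_composition_bits(n, k),
          round(lower_bound_bits(n, k), 2))
```

Both upper bounds grow like binomial coefficients in n, whereas the lower bound grows only logarithmically, which is the gap the starting-matrix technique is meant to narrow.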

Definition 2. A starting matrix SM(n, ℓ, k) is an n × ℓ matrix whose entries
are elements of a ground set $\{a_1, \ldots, a_k\}$, with the property that, for any subset
of k rows, there exists at least one column such that the entries in the k given
rows of that column are all distinct.
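
The defining property of a starting matrix can be verified by brute force on small instances. In the sketch below the ground set is represented by the integers 0, …, k − 1 and the example matrices are hypothetical; the check enumerates every subset of k rows, so it is only practical for small n.

```python
from itertools import combinations

def is_starting_matrix(matrix, k):
    """True iff every k rows are pairwise distinct in at least one column."""
    n, width = len(matrix), len(matrix[0])
    return all(
        any(len({matrix[r][t] for r in rows}) == k for t in range(width))
        for rows in combinations(range(n), k)
    )

# k = 2, n = 4: the binary expansions of the row indices give two columns.
binary_rows = [[0, 0], [0, 1], [1, 0], [1, 1]]

# k = 3, n = 4: a hand-made example with two columns.
ternary_rows = [[0, 0], [1, 1], [2, 1], [0, 2]]

# Fails for k = 3: rows 1 and 2 agree in the only column.
bad_rows = [[0], [1], [1]]

print(is_starting_matrix(binary_rows, 2),
      is_starting_matrix(ternary_rows, 3),
      is_starting_matrix(bad_rows, 3))
```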

Let M* be a starting matrix SM(n, ℓ, k). We can construct a (k, n)-threshold
VCS as follows. Let $\mathcal{C}_0 = \{M_1^0, \ldots, M_{2^{k-1}}^0\}$ and $\mathcal{C}_1 = \{M_1^1, \ldots, M_{2^{k-1}}^1\}$ be
the two collections of $k \times h2^{k-1}$, for some h ≥ 1, matrices realizing the
(k, k)-threshold VCS of Theorem 3. Let $c_t$ denote the t-th column of M*. For j =
1, …, $2^{k-1}$, let $M_j^0\{c_t\}$ ($M_j^1\{c_t\}$, resp.) denote the n × m matrix obtained by
replacing each $a_i$-entry in $c_t$, for i = 1, …, k, with the i-th row of $M_j^0$ ($M_j^1$,
resp.). For j = 1, …, $2^{k-1}$, a row of $M_j^0\{c_t\}$ ($M_j^1\{c_t\}$, resp.) corresponding to
an $a_i$-entry of $c_t$ is called an $a_i$-row.

Theorem 11. Let C_0 = {M^0_1, ..., M^0_{2^{k-1}}} and C_1 = {M^1_1, ..., M^1_{2^{k-1}}} be the two
collections of k × h2^{k-1} matrices, for some h ≥ 1, realizing the (k, k)-threshold
VCS of Theorem 0 and let c_t denote the t-th column of a starting matrix
SM(n, ℓ, k). For i = 1, ..., k, let Q_{i,t} = {v : v ∈ {1, ..., n} and the v-th entry of
c_t is equal to a_i}. The two matrix collections D_{0,t} = {M^0_1{c_t}, ..., M^0_{2^{k-1}}{c_t}}
and D_{1,t} = {M^1_1{c_t}, ..., M^1_{2^{k-1}}{c_t}} realize a VCS with randomness k − 1 for
the strong access structure (Γ_Qual, Γ_Forb) on participant set V = {1, ..., n} with
basis Γ_0 = {X ⊆ V : |X| = k and |X ∩ Q_{i,t}| = 1, for i = 1, ..., k}.

Let {c_1, ..., c_ℓ} be the set of the columns of the starting matrix M* and let
D_{0,t} = {M^0_1{c_t}, ..., M^0_{2^{k-1}}{c_t}} and D_{1,t} = {M^1_1{c_t}, ..., M^1_{2^{k-1}}{c_t}}, for t =
1, ..., ℓ. Let D_0 = {D_{0,1} ∘ ... ∘ D_{0,ℓ}} and D_1 = {D_{1,1} ∘ ... ∘ D_{1,ℓ}}. Lemma 0
and Theorem 11 imply that D_0 and D_1 realize a strong (k, n)-threshold VCS.
Hence, the following theorem holds.

Theorem 12. If there exists a starting matrix SM(n, ℓ, k) then there exists a
strong (k, n)-threshold VCS with randomness (k − 1)ℓ.

The SM matrix is a representation of a Perfect Hash Family (or PHF). Fredman
and Komlos UDI proved that for any PHF it holds that £ = f2{k^ ^/fc!)logn. 
They also proved the weaker but simpler bound £ = 17(1/ log fc) logn. Mehlhorn 
m proved that there exist PHFs with £ = 0(fce^)logn. These bounds are in 
general non-constructive, but in uni there can be found a recursive construction 
which for any constant k > 2 and for any integer n > k, yields a PHF with 
£ = ” logn). Therefore, the following corollaries of Theorem IT^ hold. 

Corollary 1. For any k and n with 2 ≤ k ≤ n, there exists a (k, n)-threshold
VCS with randomness O(k^2 e^k) log n.



Corollary 2. Let k > 2 be a constant. For any n > k there exists a constructible 
(k,n) -threshold VCS with randomness ”“''^^logn). 



Naor et al. 11 4| 


Ateniese et al. Q 


Thm. El 


log(2'=-b) 


log ((0(fc(2e)'‘)logn)!) 




Thm.EI 


Cor. 0] 


Cor. El 




0{k^e’^) logn 


0(fc(2i°s* logn), 

for k constant 



The above table reports the values of previously known upper bounds on the 
randomness of the {k, n)-threshold VCSs, along with those derived in the present 
paper. The first upper bound reported in the table is relative to the random- 
ness of the very first (fc, n)-threshold VCS described in literature jHj. By 
Stirling approximation formula, the randomness of this scheme is larger than 
nf log (^\/2^tt{ 2^~^ / e)"^ ^ . Then, the randomness of this scheme is larger than 

the randomness of all (/c, n)-threshold VCSs described in the present paper. 

Since {jjfi) = n-’t+i (fe) ’ then the randomness of the scheme implied by The- 
orem 0 is always smaller than or equal to that of the scheme implied by The- 
orem 0 Notice that the upper bound of Corollary 0 is asymptotically smaller 
than that implied by Theorem El when fc is a sublinear function of n, whereas 
the bound of Corollary El is asymptotically smaller than that implied by The- 
orem El for any fixed fc with 2 < fc < n. In our knowledge, all efficient con- 
structions for (fc, n)-threshold VCS, given so far in literature, use basis ma- 
trices. The randomness of such a scheme with pixel expansion m is equal to 
log(m!). For that reason, it is interesting to compare the (fc, n)-threshold VCSs 
of Corollary [Hand Corollary El with the (fc, n)-threshold VCS having the smallest 
pixel expansion among those given so far in literature P^. This (fc, n)-threshold 
VCS has pixel expansion 0(fc(2e)^) log n. By Stirling approximation formula, 
the asymptotical upper bound on the randomness of this scheme is larger than 
0(fc(2e)^)(logn)(log(0(fc2^e^“^) logn)). As a consequence, this upper bound is 
larger than those of both corollaries.

7 Conclusions and Open Problems

In this paper the randomness of visual cryptography schemes has been
investigated. Lower bounds on the randomness of VCSs for general access
structures and (k, n)-threshold VCSs have been derived, and general techniques to
construct (k, n)-threshold VCSs have been presented and analyzed. We have
also provided minimum randomness constructions for (2, n)-threshold VCSs and
(k, k)-threshold VCSs. All proofs and examples of our results are given in the
extended version of the paper. That version also provides minimum randomness
VCSs for all strong access structures on at most four participants. Further results 
on the randomness of visual cryptography schemes can be found in |Dj . 

Randomness is an aspect of visual cryptography which has been considered
in this paper for the first time. Nevertheless, it deserves further investigation.
Indeed, many problems are left open. The most challenging one is to reduce
the gap between our lower bounds and upper bounds on the randomness of
(k, n)-threshold VCSs for 2 < k < n.

Abstract. We consider the following online dial-a-ride problem
(OlDarp): Objects are to be transported between points in a metric 
space. Transportation requests arrive online, specifying the objects to 
be transported and the corresponding source and destination. These re- 
quests are to be handled by a server which starts its work at a designated 
origin and which picks up and drops objects at their sources and desti- 
nations. The server can move at constant unit speed. After the end of its 
service the server returns to its start in the origin. The goal of OlDarp 
is to come up with a transportation schedule for the server which finishes 
as early as possible, i.e., which minimizes the makespan. 

We analyze several competitive algorithms for OlDarp and establish 
tight competitiveness results. The first two algorithms, REPLAN and 
IGNORE are very simple and natural: REPLAN completely discards its 
(preliminary) schedule and recomputes a new one when a new request ar- 
rives. IGNORE always runs a (locally optimal) schedule for a set of known 
requests and ignores all new requests until this schedule is completed. We 
show that both strategies, REPLAN and IGNORE, are 5/2-competitive. 
We then present a somewhat less natural strategy SMARTSTART, which 
in contrast to the other two strategies may leave the server idle from 
time to time although unserved requests are known. The SMARTSTART- 
algorithm has an improved competitive ratio of 2, which matches our 
lower bound. 



1 Introduction 

Transportation problems where objects are to be transported between given 
sources and destinations in a metric space are classical problems in combinatorial 
optimization. In the classical setting, one assumes that the complete input for 
an instance is available for an algorithm to compute a solution. In many cases 
this offline optimization does not reflect the real-world situation appropriately. 
For instance, the transportation requests in an elevator system are hardly known 
in advance. Decisions have to be made online without the knowledge of future 
requests. 

* Research supported by the German Science Foundation (grant 883/5-2)

Online algorithms are tailored to cope with such situations. Whereas offline
algorithms work on the complete input sequence, online algorithms only get to 
see the requests released so far and thus have to account for future requests that 
may or may not arise at a later time. A common way to evaluate the quality 
of online algorithms is competitive analysis. An algorithm ALG is called
c-competitive if its “cost” on any input sequence is at most c times the optimum
offline cost.

In this paper we consider the following online dial-a-ride problem (OlDarp): 
Objects are to be transported between the points of a metric space. A request 
consists of the objects to be transported and the corresponding source and des- 
tination of the transportation request. The requests arrive online and must be 
handled by a server which starts and ends its work at a distinguished origin. The 
server picks up and drops objects at their sources and destinations, respectively. 
The goal of OlDarp is to come up with a transportation schedule for the server 
which finishes as early as possible. 

Related Work. We do not claim originality for the two online-algorithms IG- 
NORE and REPLAN; instead, we show how to analyze them for OlDarp and 
how ideas from both strategies can be used to construct a new online strategy 
SMARTSTART with better competitive ratio. 

The first — to the best of our knowledge — occurrence of the strategy IGNORE 
can be found in the paper by Shmoys, Wein, and Williamson m-- They show 
a general result about obtaining competitive algorithms for minimizing the to- 
tal completion time (also called the makespan) in machine scheduling problems 
when the jobs arrive over time: If there is a p-approximation algorithm for the of- 
fline version, then this implies a 2p-competitive algorithm for the online- version, 
which is essentially the IGNORE strategy. The results from [El show that IG- 
NORE-type strategies are 2-competitive for a number of online-scheduling prob- 
lems. The strategy REPLAN is probably folklore; it can be found also under 
different names like reopt or optimal. 

The difference of OlDarp studied here to the machine scheduling problems 
treated in [El are as follows: For OlDarp, the “execution time” of jobs depends 
on their execution order. OlDarp can be viewed as a generalized scheduling 
problem with setup costs and order dependent execution times. It should be 
stressed that in this paper we do not allow additive constants in the definition of 
the competitive ratio. If one allowed an additive constant equal to the diameter 
of the metric space, then the algorithms REPLAN and IGNORE would in fact be 
also 2-competitive. 

In the authors studied the Online Traveling Salesman Problem 

(OlTsp) which is obtained as a special case of OlDarp treated in this pa- 
per, when for each request its source and destination coincide. It is shown in 0 
that there is a metric space (the boundary of the unit square) where any deter- 
ministic algorithm for the OlTsp has a competitive ratio of at least 2. For the 
case that the metric space is the real line, a lower bound of « 1.64 is given 

in A 2-competitive algorithm that works in an arbitrary (symmetric) metric 
space and a 7/4-competitive algorithm for the real line are presented in jSj .

Recently, in an independent effort, Feuerstein and Stougie |3 analyzed the
algorithm IGNORE (which they call DLT for “Don’t listen while traveling”) and
established the same competitive ratio as in this paper.

Our Contribution. We provide a number of competitive algorithms for the 
basic version of the problem OlDarp. The best algorithm, SMARTSTART, has 
competitive ratio 2 which improves the result of 5/2 established in 0. This 
competitiveness result closes the gap to the lower bound. 

Our algorithm SMARTSTART induces an alternative 2-competitive algorithm
for the OlTsp. Moreover, SMARTSTART can be used to obtain a
(7 + √13)/4-competitive polynomial time algorithm for the OlTsp. Since
(7 + √13)/4 ≈ 2.6514, this improves the result of ISEI, where a 3-competitive
polynomial time algorithm for the OlTsp was presented.

IGNORE and SMARTSTART can be applied to the generalization of OlDarp
(and OlTsp) where there are k servers with arbitrary capacities C_1, ..., C_k ∈ ℕ.
Their competitive ratios of 5/2 and 2, respectively, hold as well in the generalized
case. As a corollary, we obtain 2-competitive algorithms for a number of machine 
scheduling problems with setup-costs which generalizes a result from m- 

Paper Outline. This paper is organized as follows: In Section 2 we formally
define the problem OlDarp and introduce notation. In Section 3 we establish lower
bounds for the competitive ratio of deterministic online algorithms for OlDarp.
In Section 4 we analyze two simple strategies, REPLAN and IGNORE. Section 5
contains our improved algorithm SMARTSTART. The competitive ratios of our
three algorithms are summarized in Table 1.



Table 1. Competitive ratios of algorithms in this paper. 



Algorithm          REPLAN   IGNORE   SMARTSTART   Lower bound
Closed Schedules   5/2      5/2      2            2



2 Preliminaries 

An instance of the basic online dial-a-ride problem OlDarp consists of a
metric space M = (X, d) with a distinguished origin o ∈ X and a sequence σ =
r_1, ..., r_m of requests. It is assumed that for all pairs (x, y) of points from M,
there is a path p : [0, 1] → X in X with p(0) = x and p(1) = y of length d(x, y)
(see [S| for a thorough discussion of the model). Examples of metric spaces
that satisfy the above condition are the Euclidean space and a metric space
induced by an undirected edge-weighted graph.

Each request is a triple r_i = (t_i, a_i, b_i), where t_i is a real number, the time
at which request r_i is released (becomes known), and a_i ∈ X and b_i ∈ X are
the source and destination, respectively, between which the new object is to be
transported. We assume that the sequence σ = r_1, ..., r_m of requests is given in
order of non-decreasing release times.

A server is located at the origin o ∈ X at time 0 and can move at constant
unit speed. In the basic version the server has unit capacity, i.e., it can carry at
most one object at a time. (Extensions to the case of more servers with arbitrary 
capacities are also considered in this paper.) We do not allow preemption: once 
the server has picked up an object, it is not allowed to drop it at any other place 
than its destination. 

An online algorithm for OlDarp has neither information about the
release time of the last request nor about the total number of requests. The online
algorithm must determine the behavior of the server at a certain moment t of
time as a function depending on all the requests released up to time t and on
the current time t. In contrast, the offline algorithm has information about all
requests in the whole sequence σ already at time 0.

Given a sequence σ of requests, a valid transportation schedule for σ is a
sequence of moves of the server such that the following conditions are satisfied:
(a) The server starts its movement in the origin vertex o, (b) each transportation
request in σ is served, but starting not earlier than the time it becomes known,
and (c) the server returns to the origin vertex after having served the last request.

The objective function of the OlDarp is the (total) completion time (also
called the “makespan” in scheduling) of the server, that is, the time when the
server has served all requests and returned to the origin.

Let ALG(σ) denote the completion time of the server moved by algorithm ALG
on the sequence σ of requests. We use OPT to denote an optimal offline
algorithm. An online algorithm ALG for OlDarp is c-competitive, if there exists
a constant c such that for any request sequence σ the inequality ALG(σ) ≤
c · OPT(σ) holds true. It should be mentioned that sometimes in the literature the
definition of the competitive ratio allows an additive constant. In that case a
c-competitive algorithm with additive constant zero is referred to as strictly
c-competitive. In this paper we do not allow additive constants.

3 Lower Bounds 

We first address the question how well an online algorithm can perform compared 
to the optimal offline algorithm. Since OlDarp generalizes the OlTsp we obtain 
the following result from the lower bound established in 0: 

Theorem 1. If ALG is a deterministic c-competitive algorithm for OlDarp,
then c ≥ 2. □

For the case that the metric space is the real line, a lower bound of
(9 + √17)/8 ≈ 1.64 was previously known for the OlTsp. We provide an improved
lower bound for OlDarp below:

Theorem 2. If ALG is a deterministic c-competitive algorithm for OlDarp on
the real line, then c ≥ 1 + √2/2 ≈ 1.707.

Proof. Suppose that ALG is a deterministic online algorithm with competitive
ratio c ≤ 1 + √2/2. At time t = 0, the algorithm ALG is faced with two requests
r_1 = (0, o, 2) and r_2 = (0, 2, o). The optimum offline cost to serve these two
requests is 4.

The server operated by ALG must start serving request r_2 at some time T with
2 ≤ T ≤ 4c − 2, because otherwise ALG could not be c-competitive. At time T
the adversary issues another request r_3 = (T, T, 2). Then OPT(r_1, r_2, r_3) = 2T.
On the other hand, ALG(r_1, r_2, r_3) ≥ 3T + 2. Thus, the competitive ratio c of ALG
satisfies

\[ c \ge \frac{3T+2}{2T} = \frac{3}{2} + \frac{1}{T} \ge \frac{3}{2} + \frac{1}{4c-2}. \]

The smallest value c ≥ 1 such that c ≥ 3/2 + 1/(4c − 2) is c = 1 + √2/2. □
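The last step of the proof can be verified directly. Multiplying c = 3/2 + 1/(4c − 2) by (4c − 2) gives 4c² − 8c + 2 = 0, whose larger root is 1 + √2/2. The sketch below is an illustration only (the helper names `rhs` and `smallest_feasible_c` are assumptions): it confirms the closed form numerically by bisection on c − rhs(c), which is increasing for c > 1/2.

```python
import math


def rhs(c):
    """Right-hand side of the inequality c >= 3/2 + 1/(4c - 2)."""
    return 1.5 + 1.0 / (4.0 * c - 2.0)


def smallest_feasible_c(lo=1.0, hi=2.0, iterations=100):
    """Bisection for the smallest c in [lo, hi] with c >= rhs(c).

    Assumes c - rhs(c) is negative at lo and positive at hi.
    """
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if mid >= rhs(mid):
            hi = mid
        else:
            lo = mid
    return hi


closed_form = 1.0 + math.sqrt(2.0) / 2.0
print(closed_form, smallest_feasible_c())
```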

4 Two Simple Strategies 

In this section we present and analyze two very natural online strategies for
OlDarp and prove both of them to be 5/2-competitive.

Strategy REPLAN As soon as a new request arrives the server completes the 
current carrying move (if it is performing one), then the server stops and 
replans: it computes a new shortest schedule which starts at the current 
position of the server, takes care of all yet unserved requests, and returns to 
the origin. 

Strategy IGNORE The server remains idle until the point in time t when the 
first requests become known. The algorithm then serves the requests released 
at time t immediately, following a shortest schedule S. All requests that ar- 
rive during the time when the algorithm follows this schedule are temporarily 
ignored. After S has been completed and the server is back in the origin, the 
algorithm computes a shortest schedule for all unserved requests and follows 
this schedule. Again, all new requests that arrive during the time that the 
server is following the schedule are temporarily ignored. A schedule for the 
ignored requests is computed as soon as the server has completed its cur- 
rent schedule. The algorithm keeps on following schedules and temporarily 
ignoring requests this way. 
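The IGNORE strategy can be sketched for tiny instances on the real line. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the origin is the point 0, the server has unit capacity, and the shortest closed schedule is found by brute force over all service orders. The function names and the sample instance are chosen here for illustration.

```python
from itertools import permutations


def shortest_closed_schedule(moves, origin=0.0):
    """Length of a shortest tour on the real line that starts and ends at the
    origin and serves all (source, destination) pairs one at a time.
    Brute force over all service orders; only meant for tiny instances."""
    best = float("inf")
    for order in permutations(moves):
        pos, length = origin, 0.0
        for src, dst in order:
            length += abs(pos - src) + abs(src - dst)  # empty move, then carrying move
            pos = dst
        best = min(best, length + abs(pos - origin))   # return to the origin
    return best


def ignore_completion_time(requests, origin=0.0):
    """Simulate IGNORE. requests: list of (release_time, source, destination)."""
    pending = sorted(requests)
    t = 0.0
    while pending:
        t = max(t, pending[0][0])                      # idle at the origin until a request is known
        batch = [r for r in pending if r[0] <= t]      # requests known now
        pending = [r for r in pending if r[0] > t]     # later arrivals are ignored for now
        t += shortest_closed_schedule([(a, b) for _, a, b in batch], origin)
    return t


# Illustrative instance (an assumption, chosen for this sketch): three requests
# on the real line with eps = 0.25.
eps = 0.25
instance = [(0.0, 1.0, 0.0), (eps, 2.0, 2.0 + eps), (2.0 + eps, 2.0 + eps, 1.0)]
print(ignore_completion_time(instance))  # 11.0
```

On this instance IGNORE runs three separate schedules of lengths 2, 4.5 and 4.5, whereas an offline server that serves the second, third and first request in this order can finish at time 4.5, so the ratio is already close to 5/2.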

Both algorithms above repeatedly solve “offline instances” of OlDarp. These
offline instances have the property that all release times are no later than
the current time. Thus, the corresponding offline problem is the following: given
a number of transportation requests (with release times all zero), find a shortest
transportation schedule for them.

For a sequence σ of requests and a point x in the metric space M let L*(t, x, σ)
denote the length of a shortest schedule (i.e., the time difference between its
completion time and the start time t) which starts in x at time t, serves all
requests from σ (but not earlier than their release times), and ends in the origin.
Clearly, for t' ≥ t we have that L*(t', x, σ) ≤ L*(t, x, σ). Moreover, OPT(σ) =
L*(0, o, σ) and thus OPT(σ) ≥ L*(t, o, σ) for any time t ≥ 0.

Since the optimum offline server OPT cannot serve the last request r_m =
(t_m, a_m, b_m) from σ before this request is released we get that

\[ \mathrm{OPT}(\sigma) \ge \max\{ L^*(t, o, \sigma),\; t_m + d(a_m, b_m) + d(b_m, o) \} \quad \text{for any } t \ge 0. \tag{1} \]

Lemma 1. Let σ = r_1, ..., r_m be a sequence of requests. Then for any t ≥ t_m
and any request r_i = (t_i, a_i, b_i) from σ

\[ L^*(t, b_i, \sigma \setminus r_i) \le L^*(t, o, \sigma) - d(a_i, b_i) + d(a_i, o). \]

Here σ\r_i denotes the sequence obtained from σ by deleting the request r_i.

Proof. Consider an optimum schedule S* which starts at the origin o at time t,
serves all requests in σ and has length L*(t, o, σ). It suffices to construct another
schedule S which starts in b_i no earlier than time t, serves all requests in σ\r_i
and has length at most L*(t, o, σ) − d(a_i, b_i) + d(a_i, o).

Let S* serve the requests in the order r_{j_1}, ..., r_{j_m} and let r_i = r_{j_k}. Notice
that if we start in b_i at time t and serve the requests in the order

\[ r_{j_{k+1}}, \dots, r_{j_m}, r_{j_1}, \dots, r_{j_{k-1}} \]

and move back to the origin we obtain a schedule S with the desired properties.
□



We are now ready to prove the result about the performance of REPLAN:

Theorem 3. Algorithm REPLAN is 5/2-competitive.

Proof. Let σ = r_1, ..., r_m be any sequence of requests. We distinguish between
two cases depending on the current load of the REPLAN-server at the time t_m
(i.e., the time when the last request is released).

If the server is currently empty it recomputes an optimal schedule which
starts at its current position, denoted by s(t_m), serves all unserved requests,
and returns to the origin. This schedule has length at most L*(t_m, s(t_m), σ) ≤
d(o, s(t_m)) + L*(t_m, o, σ). Thus, using (1),

\[ \mathrm{REPLAN}(\sigma) \le t_m + d(o, s(t_m)) + L^*(t_m, o, \sigma) \le t_m + d(o, s(t_m)) + \mathrm{OPT}(\sigma). \tag{2} \]

We now consider the second case, when the server is currently serving a request
r = (t, a, b). The time needed to complete this move is d(s(t_m), b). Then a
shortest schedule starting at b serving all unserved requests is computed which
has length at most L*(t_m, b, σ\r). Thus in the second case

\[
\begin{aligned}
\mathrm{REPLAN}(\sigma) &\le t_m + d(s(t_m), b) + L^*(t_m, b, \sigma \setminus r) \\
&\le t_m + d(s(t_m), b) + L^*(t_m, o, \sigma) - d(a, b) + d(a, o) && \text{by Lemma 1} \\
&\le t_m + \mathrm{OPT}(\sigma) - d(a, b) + \underbrace{d(s(t_m), b) + d(a, s(t_m))}_{= d(a, b)} + d(s(t_m), o) \\
&= t_m + d(o, s(t_m)) + \mathrm{OPT}(\sigma).
\end{aligned}
\]

This means that inequality (2) holds in both cases. Since the REPLAN server has
traveled to position s(t_m) at time t_m, there must be a request r_i = (t_i, a_i, b_i)
in σ where either d(o, a_i) ≥ d(o, s(t_m)) or d(o, b_i) ≥ d(o, s(t_m)). By the triangle
inequality this implies that the optimal offline server will have to travel at least
twice the distance d(o, s(t_m)) during its schedule. Thus, d(o, s(t_m)) ≤ OPT(σ)/2.
Plugging this result into inequality (2) and using t_m ≤ OPT(σ), we get that the
total time the REPLAN server needs is no more than 5/2 OPT(σ). □

We are now going to analyze the competitiveness of the second simple strat- 
egy IGNORE. 

Theorem 4. Algorithm IGNORE is 5/2-competitive.

Proof. Consider again the point in time t_m when the last request r_m becomes
known. If the IGNORE server is currently idle at the origin o, then it completes
its last schedule no later than t_m + L*(t_m, o, σ_{=t_m}), where σ_{=t_m} is the set of
requests released at time t_m. Since L*(t_m, o, σ_{=t_m}) ≤ OPT(σ) and OPT(σ) ≥ t_m,
it follows that in this case IGNORE completes no later than time 2 OPT(σ).

It remains the case that at time t_m the IGNORE-server is currently working
on a schedule S for a subset σ_S of the requests. Let t_S denote the starting time of
this schedule. Thus, the IGNORE-server will complete S at time t_S + L*(t_S, o, σ_S).
Denote by σ_{>t_S} the set of requests presented after the IGNORE-server started
with S at time t_S. Notice that σ_{>t_S} is exactly the set of requests that are served
by IGNORE in its last schedule. The IGNORE-server will complete its total service
no later than time t_S + L*(t_S, o, σ_S) + L*(t_m, o, σ_{>t_S}).

Let r_f ∈ σ_{>t_S} be the first request from σ_{>t_S} served by OPT. Thus

\[ \mathrm{OPT}(\sigma) \ge t_f + L^*(t_f, a_f, \sigma_{>t_S}) \ge t_S + L^*(t_m, a_f, \sigma_{>t_S}). \tag{3} \]

Now, L*(t_m, o, σ_{>t_S}) ≤ d(o, a_f) + L*(t_m, a_f, σ_{>t_S}) and L*(t_S, o, σ_S) ≤ OPT(σ).
Therefore,

\[
\begin{aligned}
\mathrm{IGNORE}(\sigma) &\le t_S + \mathrm{OPT}(\sigma) + d(o, a_f) + L^*(t_m, a_f, \sigma_{>t_S}) \\
&\le 2\,\mathrm{OPT}(\sigma) + d(o, a_f) && \text{by (3)} \\
&\le \tfrac{5}{2}\,\mathrm{OPT}(\sigma).
\end{aligned}
\]

This completes the proof. □

The following instance shows that the competitive ratio of 5/2 proved for
IGNORE is asymptotically tight even for the case when the metric space is the real
line. At time 0 there is a request r_1 = (0, 1, 0). The next requests are r_2 = (ε, 2, 3)
and r_3 = (2 + ε, 2, 1). It is easy to see that IGNORE(r_1, r_2, r_3) = 10 + 4ε, while
OPT(r_1, r_2, r_3) = 4 + 2ε. Thus, the ratio IGNORE(r_1, r_2, r_3)/OPT(r_1, r_2, r_3) can
be made arbitrarily close to 5/2 by choosing ε > 0 small enough.

4.1 Application to Machine Scheduling Problems

We close this section by showing how the IGNORE strategy can be applied to a 
generalization of the basic problem OlDarp and how this generalization can be 
used to model certain machine scheduling problems where there are setup costs 
for different job types. Consider the generalization k-OlDarp of OlDarp when
there are k ∈ ℕ servers with arbitrary capacities C_1, ..., C_k ∈ ℕ. Capacity C_j
means that server j can carry at most C_j items at a time. The objective function
for k-OlDarp is the time when the last server has returned to the origin (after
all requests have been served).

For k-OlDarp the IGNORE strategy always plans schedules for its servers
such that the length of the longest schedule is minimized. All schedules are
constructed in such a way that they start and end in the origin. New requests
are ignored until the last of the servers has returned to the origin. It is not too
hard to see that the proof of Theorem 4 remains valid even for k-OlDarp:

Theorem 5. Algorithm IGNORE is 5/2-competitive even for the extension
k-OlDarp of OlDarp where there are k ∈ ℕ servers with arbitrary capacities
C_1, ..., C_k ∈ ℕ. □

Suppose there are k uniform machines where jobs can be run and assume
that there are ℓ types of jobs. For each job type j a setup cost s_j is given:
if a job of type i ≠ j has been processed immediately before on a machine,
then an additional cost of s_j is incurred to start processing a job of type j on
this machine. This setup cost models for instance the situation where a special
auxiliary device must be installed at the machine to perform a certain job type.
The problem of minimizing the makespan in this machine scheduling problem
when jobs arrive over time can be modeled as k-OlDarp (with k unit-capacity
servers) on a star shaped metric space with center o: For each of the ℓ job types
there is one ray in the metric space emanating from o. On ray R_j there is a
special point x_j at distance s_j/2 from o. A job of type j with processing time p
is modeled by a transportation request from the point x_j to the point x_j + p/2.
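The star-metric encoding can be made concrete with a small sketch. The code below is an illustration under assumptions: points are represented as pairs (ray index, distance from the center), the origin as `(None, 0.0)`, and all function names are hypothetical. It shows that a closed tour of a single server that processes jobs grouped by type costs the sum of the processing times plus one setup cost per visited ray.

```python
ORIGIN = (None, 0.0)  # center o of the star; representation chosen for this sketch


def star_distance(p, q):
    """Distance in the star metric: along the ray if both points share a ray,
    otherwise through the center."""
    (ray_p, dist_p), (ray_q, dist_q) = p, q
    if ray_p == ray_q:
        return abs(dist_p - dist_q)
    return dist_p + dist_q


def job_to_request(job_type, processing_time, setup):
    """Map a job to (source, destination): from x_j at distance s_j/2 on ray j
    to the point x_j + p/2 on the same ray."""
    start = setup[job_type] / 2.0
    return (job_type, start), (job_type, start + processing_time / 2.0)


def tour_length(jobs, setup):
    """Length of the closed tour of one unit-capacity server that serves the
    jobs in the given order, starting and ending at the origin."""
    pos, total = ORIGIN, 0.0
    for job_type, processing_time in jobs:
        src, dst = job_to_request(job_type, processing_time, setup)
        total += star_distance(pos, src) + star_distance(src, dst)
        pos = dst
    return total + star_distance(pos, ORIGIN)


setup = {1: 2.0, 2: 4.0}                      # hypothetical setup costs s_1, s_2
jobs = [(1, 3.0), (1, 1.0), (2, 5.0)]         # (type, processing time)
print(tour_length(jobs, setup))               # 15.0 = (3 + 1 + 5) + s_1 + s_2
```

Each job contributes p/2 for the carrying move and p/2 for moving back toward x_j or the center, which is why the processing time is halved in the request.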

5 The SMARTSTART Strategy 

In this section we present and analyze our algorithm SMARTSTART which 
achieves a best-possible competitive ratio of 2 (cf. the lower bound given in
Theorem 1). The idea of the algorithm is basically to emulate the IGNORE
strategy but to make sure that each sub-schedule is completed “not too late”: if
a sub-schedule would take “too long” to complete then the algorithm waits for a 
specified amount of time. Intuitively this construction tries to avoid the worst- 
case situation for IGNORE where right after the algorithm started a schedule a 
new request becomes known. 

The algorithm SMARTSTART has a fixed “waiting scaling” parameter θ >
1. From time to time the algorithm consults its “work-or-sleep” routine: this
subroutine computes an (approximately) shortest schedule S for all unserved
requests starting and ending in the origin. If this schedule can be completed no
later than time θt (where t is the current time) the subroutine returns (S, work),
otherwise it returns (S, sleep).

In the sequel it will be convenient to assume that the “work-or-sleep”
subroutine uses a ρ-approximation algorithm for computing a schedule: the
approximation algorithm always finds a schedule of length at most ρ times the optimum
one. While in online computation one is usually not interested in time complexity
(and thus in view of competitive analysis we can assume that ρ = 1), employing
a polynomial time approximation algorithm will enable us to get a practical
algorithm (and in particular to improve the result of I5I4I for the OlTsp).

The server of algorithm SMARTSTART can assume three states: 

idle In this case the server has served all known requests, is sitting in the origin 
and waiting for new requests to occur. 

sleeping In this case the server knows of some unserved requests but also knows 
that they take too long to serve (what “too long” means will be formalized 
in the algorithm below) . 

working In this state the algorithm (or rather the server operated by it) is 
following a computed schedule. 

We now formalize the behavior of the algorithm by specifying how it reacts 
in each of the three states. 

Strategy SMARTSTART

  - If the algorithm is idle at time T and new requests arrive, it calls
    “work-or-sleep”. If the result is (S, work), the algorithm enters the
    working state where it follows S. Otherwise the algorithm enters the
    sleeping state with wakeup time t', where t' ≥ T is the earliest time
    such that t' + l(S) ≤ θt', where l(S) is the length of the just computed
    schedule S, i.e., t' = min{ t ≥ T : t + l(S) ≤ θt }.

  - In the sleeping state the algorithm simply does nothing until its wakeup
    time t'. At this time the algorithm reconsults the “work-or-sleep”
    subroutine. If the result is (S, work), then the algorithm enters the working
    state and follows S. Otherwise the algorithm continues to sleep with new
    wakeup time min{ t ≥ t' : t + l(S) ≤ θt }.

  - In the working state, i.e., while the server is following a schedule, all
    new requests are (temporarily) ignored. As soon as the current schedule
    is completed the server either enters the idle state (if there are no
    unserved requests) or it reconsults the “work-or-sleep” subroutine which
    determines the next state (sleeping or working).
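The wakeup rule has a simple closed form: t + l(S) ≤ θt holds exactly when t ≥ l(S)/(θ − 1), so the earliest admissible start time is the maximum of the current time and l(S)/(θ − 1). The following sketch illustrates this decision; the function name `work_or_sleep` and its return format are assumptions made here, and the schedule itself is abstracted to its length.

```python
def work_or_sleep(now, schedule_length, theta):
    """Decide whether a schedule of the given length may start at time `now`.

    Returns ("work", now) if the schedule can be completed by time theta * now,
    otherwise ("sleep", wakeup_time), where wakeup_time is the earliest t >= now
    with t + schedule_length <= theta * t.
    """
    if theta <= 1.0:
        raise ValueError("theta must be greater than 1")
    if now + schedule_length <= theta * now:
        return "work", now
    # Here now < schedule_length / (theta - 1), so this value lies in the future.
    return "sleep", schedule_length / (theta - 1.0)


print(work_or_sleep(1.0, 4.0, 2.0))  # ('sleep', 4.0): 1 + 4 > 2 * 1
print(work_or_sleep(5.0, 4.0, 2.0))  # ('work', 5.0): 5 + 4 <= 2 * 5
```

With θ = 2 the server starts a schedule only once the current time is at least the schedule length, which is how the strategy avoids committing to a long tour too early.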

Theorem 6. For all θ ≥ ρ, θ > 1, Algorithm SMARTSTART is c-competitive
with

\[ c = \max\left\{ \theta,\; \rho\left(1 + \frac{1}{\theta - 1}\right),\; \frac{\theta}{2} + \rho \right\}. \]

Moreover, the best possible choice of θ is (1 + √(1 + 8ρ))/2 and yields a competitive
ratio of (4ρ + 1 + √(1 + 8ρ))/4.

Proof. Let σ_{=t_m} be the set of requests released at time t_m, where t_m denotes
again the point in time when the last request becomes known. We distinguish
between different cases depending on the state of the SMARTSTART-server at
time t_m.

Case 1: The server is idle.

In this case the algorithm consults its “work-or-sleep” routine which
computes an approximately shortest schedule S for the requests in σ_{=t_m}. The
SMARTSTART-server will start its work at time t' = min{ t ≥ t_m : t + l(S) ≤ θt },
where l(S) denotes the length of the schedule S.

If t' = t_m, then by construction the algorithm completes no later than
time θt_m ≤ θ OPT(σ). Otherwise t' > t_m and it follows that t' + l(S) = θt'.
By the performance guarantee ρ of the approximation algorithm employed in
“work-or-sleep” we have that OPT(σ) ≥ l(S)/ρ = (θ − 1)t'/ρ. Thus, it follows that

\[ \mathrm{SMARTSTART}(\sigma) = t' + l(S) \le \theta t' \le \theta \cdot \frac{\rho\,\mathrm{OPT}(\sigma)}{\theta - 1} = \rho\left(1 + \frac{1}{\theta - 1}\right)\mathrm{OPT}(\sigma). \]

U — \ \ U — L J 

Case 2: The server is sleeping.

Note that the wakeup time of the server is no later than min{ t ≥ t_m :
t + l(S) ≤ θt }, where S is now a shortest schedule for all the requests in σ not
yet served by SMARTSTART at time t_m, and we can proceed as in Case 1.

Case 3: The algorithm is working.

If after completion of the current schedule the server enters the sleeping
state then the arguments given above establish that the completion time of the
SMARTSTART-server does not exceed ρ(1 + 1/(θ − 1)) OPT(σ).

The remaining case is that the SMARTSTART-server starts its final
schedule S' immediately after having completed S. Let t_S be the time when the
server started S and denote by σ_{>t_S} the set of requests presented after the
server started S at time t_S. Notice that σ_{>t_S} is exactly the set of requests that
are served by SMARTSTART in its last schedule S'.

\[ \mathrm{SMARTSTART}(\sigma) = t_S + l(S) + l(S'). \tag{4} \]

Here l(S) and l(S') ≤ ρL*(t_m, o, σ_{>t_S}) denote the lengths of the schedules S
and S', respectively. We have that

\[ t_S + l(S) \le \theta t_S, \tag{5} \]

since SMARTSTART only starts a schedule at some time t if it can complete
it not later than time θt. Let r_f ∈ σ_{>t_S} be the first request from σ_{>t_S} served by
OPT.

Using the arguments given in the proof of Theorem 0 we conclude that 

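$$\mathrm{OPT}(\sigma) \ge t_S + L^*(t_m, a_f, \sigma_{\ge t_S}). \qquad (6)$$

Moreover, since the tour of length $L^*(t_m, a_f, \sigma_{\ge t_S})$ starts in
$a_f$ and returns to the origin, it follows from the triangle inequality that
$L^*(t_m, a_f, \sigma_{\ge t_S}) \ge d(o, a_f)$. Thus, from (6) we get

$$\mathrm{OPT}(\sigma) \ge t_S + d(o, a_f). \qquad (7)$$

On the other hand

$$l(S') \le \rho\bigl(d(o, a_f) + L^*(t_m, a_f, \sigma_{\ge t_S})\bigr) \le \rho\bigl(\mathrm{OPT}(\sigma) - t_S + d(o, a_f)\bigr), \qquad (8)$$

where the second inequality follows from (6). Using (5) and (8) in (4) and the
assumption that $\theta \ge \rho$, we obtain

$$
\begin{aligned}
\mathrm{SMARTSTART}(\sigma) &\le \theta t_S + l(S') && \text{by (5)} \\
&\le (\theta - \rho)\,t_S + \rho\,d(o, a_f) + \rho\,\mathrm{OPT}(\sigma) && \text{by (8)} \\
&\le \theta\,\mathrm{OPT}(\sigma) + (2\rho - \theta)\,d(o, a_f) && \text{by (7)} \\
&\le \begin{cases} \theta\,\mathrm{OPT}(\sigma) + (2\rho - \theta)\,\dfrac{\mathrm{OPT}(\sigma)}{2}, & \text{if } \theta \le 2\rho \\ \theta\,\mathrm{OPT}(\sigma), & \text{if } \theta > 2\rho \end{cases} \\
&\le \max\left\{\frac{\theta}{2} + \rho,\ \theta\right\}\mathrm{OPT}(\sigma).
\end{aligned}
$$

This completes the proof. □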



For “pure” competitive analysis we may assume that each schedule $S$ computed
by “work-or-sleep” is in fact an optimal schedule, i.e., that $\rho = 1$. The
best competitive ratio for SMARTSTART is then achieved for that value of
$\theta$ where the three terms $\theta$, $1 + \frac{1}{\theta - 1}$ and
$\frac{\theta}{2} + 1$ are equal. This is the case for $\theta = 2$
and yields a competitive ratio of 2. We thus obtain the following corollary.

Corollary 1. For $\rho = 1$, $\theta = 2$ algorithm SMARTSTART is 2-competitive. □

The offline dial-a-ride problem is also known as the Stacker-Crane-Problem. 
In 1101 the authors present a 9/5-approximation algorithm. On paths the problem 
can be solved in polynomial time 0m. In [3 an approximation algorithm for 
the single server dial-a-ride problem with performance $O(\sqrt{C}\,\log n \log\log n)$ was
given, where C denotes the capacity of the server. 

For the special case of the Online Traveling Salesman Problem (OlTsp) 
Christofides’ algorithm |H| yields a polynomial time approximation algorithm 
with $\rho = 3/2$. For this value of $\rho$, the best competitive ratio of SMARTSTART
is attained for $\theta = \frac{1 + \sqrt{13}}{2}$ and equals $\frac{7 + \sqrt{13}}{4} \approx 2.6514$. Thus, for the OlTsp our
algorithm SMARTSTART can be used to obtain a polynomial time competitive 
algorithm with competitive ratio approximately 2.6514. This improves the result 
of jf)l4] where a 3-competitive polynomial time algorithm for OlTsp was given. 
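The bounds derived in the proof above can be checked numerically. The Python
sketch below evaluates
$\max\{\theta,\ \rho(1 + \frac{1}{\theta - 1}),\ \frac{\theta}{2} + \rho\}$ and a
candidate value of $\theta$ obtained by equating the last two terms, which gives
$\theta^2 - \theta - 2\rho = 0$. The function names are illustrative, and the
closed form for $\theta$ is a derivation added here under the assumption
$1 \le \rho \le 3$ (so that $\theta \ge \rho$ still holds); it is not quoted
from the paper.

```python
import math


def smartstart_ratio(theta: float, rho: float) -> float:
    """Upper bound on the competitive ratio for given theta and rho."""
    if theta <= 1.0 or theta < rho:
        raise ValueError("need theta > 1 and theta >= rho")
    return max(theta, rho * (1.0 + 1.0 / (theta - 1.0)), theta / 2.0 + rho)


def balanced_theta(rho: float) -> float:
    """Positive root of theta^2 - theta - 2*rho = 0 (assumed balancing point)."""
    return (1.0 + math.sqrt(1.0 + 8.0 * rho)) / 2.0


if __name__ == "__main__":
    for rho in (1.0, 1.5):
        theta = balanced_theta(rho)
        print(rho, round(theta, 4), round(smartstart_ratio(theta, rho), 4))
```

For $\rho = 1$ this reproduces $\theta = 2$ with ratio 2, and for $\rho = 3/2$ it
reproduces the ratio of approximately 2.6514 quoted above.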

We finally note that the SMARTSTART-strategy inherits some desirable 
properties from the IGNORE-strategy: The algorithm can also be used for k- 
OlDarp and provides the same competitive ratio. 

Theorem 7. Algorithm SMARTSTART is 2-competitive for the extension
$k$-OlDarp of OlDarp where there are $k \in \mathbb{N}$ servers with arbitrary
capacities $C_1, \dots, C_k \in \mathbb{N}$. □

The last result implies a 2-competitive algorithm for the machine scheduling 
problems with setup costs discussed in Section ITTl 
6 Remarks

Our investigations of the OlDarp were originally motivated by the performance 
analysis of a large distribution center of Herlitz AG, Berlin. Its automatic
pallet transportation system employs several vertical transportation systems
(elevators) in order to move pallets between the various floors of the building.
The pallets that have to be transported during one day of production are not 
known in advance. If the objective is chosen as minimizing the completion time 
(makespan) then this can be modeled by the OlDarp where the metric space 
is induced by a graph which is a simple path. 
Abstract. Given a finite set $S$ of points (i.e. the stations of a radio
network) on the plane and a positive integer $1 \le h \le |S| - 1$, the
2d Min h R. Assign. problem consists of assigning transmission ranges
to the stations so as to minimize the total power consumption provided 
that the transmission ranges of the stations ensure the communication 
between any pair of stations in at most h hops. 

We provide a lower bound on the total power consumption $\mathrm{opt}_h(S)$
yielded by an optimal range assignment for any instance $(S, h)$ of 2d
Min h R. Assign., for any positive constant $h > 0$. The lower bound
is a function of |S|, h and the minimum distance over all the pairs of 
stations in S. Then, we derive a constructive upper bound for the same 
problem as a function of |S|, h and the maximum distance over all the 
pairs of stations in S (i.e. the diameter of S). Finally, by combining the 
above bounds, we obtain a polynomial-time approximation algorithm for 
2d Min h R. Assign. restricted to well-spread instances, for any positive
constant h. 

Previous results for this problem were known only in special 1-dimensional 
configurations (i.e. when points are arranged on a line). 



Keywords: Approximation Algorithms, Lower Bounds, Multi-Hop Packet Ra- 
dio Networks, Power Consumption. 



1 Introduction 

A Multi-Hop Packet Radio Network |S| is a finite set of radio stations located on a 
geographical region that are able to communicate by transmitting and receiving 
radio signals. A transmission range is assigned to each station s and any other 
station t within this range can directly (i.e. by one hop) receive messages from s. 
Communication between two stations that are not within their respective ranges 
can be achieved by multi-hop transmissions. In general, Multi-Hop Packet Radio 
Networks are adopted whenever the construction of more traditional networks 
is impossible or, simply, too expensive. 
It is reasonably assumed 0 that the power $P_t$ required by a station $t$ to
correctly transmit data to another station $s$ must satisfy the inequality

$$\frac{P_t}{d(t, s)^{\beta}} \ge \gamma \qquad (1)$$

where $d(t, s)$ is the distance between $t$ and $s$, $\beta \ge 1$ is the
distance-power gradient, and $\gamma \ge 1$ is the transmission-quality
parameter. In an ideal environment (see jOI) $\beta = 2$, but it may vary from 1
to more than 6 depending on the environmental conditions of the place where the
network is located. In the rest of the paper, we fix $\beta = 2$ and
$\gamma = 1$; however, our results can be easily extended to any
$\beta, \gamma \ge 1$.
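As a small illustration of inequality (1), the Python sketch below computes the
smallest power that satisfies $P_t / d(t, s)^{\beta} \ge \gamma$ and checks
whether a given power suffices. The function names and the default values
$\beta = 2$, $\gamma = 1$ (the values fixed in the text) are assumptions made
for this example.

```python
def min_transmission_power(distance: float, beta: float = 2.0, gamma: float = 1.0) -> float:
    """Smallest power P with P / distance**beta >= gamma."""
    return gamma * distance ** beta


def can_transmit(power: float, distance: float, beta: float = 2.0, gamma: float = 1.0) -> bool:
    """True if a station with the given power reaches a receiver at this distance."""
    if distance == 0:
        return True  # co-located receiver: any non-negative power suffices
    return power / distance ** beta >= gamma


if __name__ == "__main__":
    print(min_transmission_power(3.0))   # power needed to cover distance 3
    print(can_transmit(8.9, 3.0))        # slightly too little power
```

With $\beta = 2$ and $\gamma = 1$ the required power is simply the squared
distance, which is why the cost of a range assignment is a sum of squared ranges.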

Given a set $S = \{s_1, \dots, s_n\}$ of radio stations on a Euclidean space, a
range assignment for $S$ is a function $r : S \to \mathcal{R}^+$, and the cost of
$r$ is defined as

$$\mathrm{cost}(r) = \sum_{i=1}^{n} r(s_i)^2.$$

As defined in the abstract, the 2d Min h R. Assign. problem consists of
finding a minimum cost range assignment for a given set $S$ of radio stations
on the plane provided that the assignment ensures the communication between
any pair of stations in at most $h$ hops, where $h$ is an input integer parameter
($1 \le h \le |S| - 1$).



1.1 Previous Works 

Combinatorial optimization problems arising from the design of radio networks 
have been the subject of several papers over the last years (see 0 for a survey). In 
particular, NP-completeness results and approximation algorithms for scheduling
communication in radio networks have been derived in j I l.'llYlisj . Kirousis et al.
investigated the complexity of the Min R. Assign. problem that consists
of minimizing the overall transmission power assigned to a set S of stations of a 
radio network, provided that (multi-hop) communication is guaranteed for any 
pair of stations (notice that no bounds are required on the maximum number of 
hops for the communication). It turns out that the complexity of this problem 
depends on the dimension of the space the stations are located on. In the 1- 
dimensional case (i.e. when the stations are located along a line) they provide a 
polynomial-time algorithm that finds a range assignment of minimum cost. As 
for stations located in the 3-dimensional space they show that Min R. Assign. 
is NP-hard. They also provide a polynomial-time 2-approximation algorithm 
that works for any dimension. Then, Clementi et al. in 0 proved that the Min
R. Assign. problem in three dimensions is APX-complete, thus implying that
it does not admit a PTAS unless P = NP (see 0 for a formal definition of these
concepts). They also prove that the Min R. Assign. problem is NP-hard in the
2-dimensional case. 

All the results mentioned above concern the case in which no restriction 
on the maximum number h of hops required by the communications among 
stations is imposed: a range assignment is feasible if it just guarantees a strong 
connectivity of the network. When, instead, a fixed bound on the number $h$ of
hops is imposed, the computational complexity of the corresponding problem is
unknown (from |4f2| we know only that the problem is NP-hard for spaces of
dimension at least 2 and $h = \Omega(n)$). However, in [4], two tight bounds for
the minimum power consumption required by $n$ points arranged on a unit chain
are given (notice that in this case, given any $n$, there is only one instance
of the problem).

Theorem 1 (The Unit Chain Case [4]). Let $N$ be a set of $n$ colinear points
at unit distance. Then the order of magnitude of the overall power required by
any optimal range assignment of diameter $h$ for $N$ is respectively:

- $\Theta\left(n^{\frac{2^{h+1} - 1}{2^h - 1}}\right)$, for any fixed positive integer $h$;

- $\Theta\left(\frac{n^2}{h}\right)$, for any $h = \Omega(\log n)$.

Furthermore the two above (implicit) upper bounds are constructive.



1.2 Our Results 

We investigate the 2d Min h R. Assign. problem for constant values of $h$ (i.e.
when h is independent from the number of stations). We first provide the fol- 
lowing general lower bound on the cost of optimal solutions for this problem. 

Theorem 2. For any set $S$ of stations on the plane, let $\delta(S)$ be the
minimum distance between any pair of different stations in $S$, and let
$\mathrm{opt}_h(S)$ be the cost of an optimal range assignment. Then, it holds
that

$$\mathrm{opt}_h(S) = \Omega\left(\delta(S)^2\,|S|^{1 + 1/h}\right),$$

for any fixed positive integer $h$.

The second result of this paper is an efficient method to derive a solution for 
any instance of our problem for fixed values of h. Given a set of stations S, let 
us define 

$$D(S) = \max\{d(s_i, s_j) \mid s_i, s_j \in S\}.$$

Then, our method yields the following result. 

Theorem 3. For any set of stations $S$ on the plane, it is possible to construct
in time $O(h|S|)$ a feasible range assignment $r_h(S)$ such that

$$\mathrm{cost}(r_h(S)) = O\left(D(S)^2\,|S|^{1/h}\right),$$

for any fixed positive integer $h$.
The above bounds provide a fast evaluation of the order of magnitude of
the power consumption required by (and sufficient for) any radio network on
the plane. This may prove useful in the network design phase in order to efficiently
select a good configuration. Indeed, instead of blindly trying and evaluating a 
huge number of tentative configurations, an easy application of our bounds could 
allow us to determine whether or not all such configurations are equivalent from 
the power consumption point of view. 

Let us now consider the instance $G_n$ of 2d Min h R. Assign. in which $n$
stations are placed on a square grid of side $\sqrt{n}$ and the distance between
adjacent pairs of stations is 1 (notice that this is the 2-dimensional version
of the unit chain case studied in [4], see Theorem 1).

Since our lower bound holds for any station set $S$, by combining Theorems 2
and 3 we easily obtain that

$$\mathrm{opt}_h(G_n) = \Theta\left(n^{1 + 1/h}\right). \qquad (2)$$

The square grid configuration is the most regular case of well-spread
instances. In general, we say that a family $\mathcal{S}$ of well-spread
instances is a family of instances $S$ such that
$D(S) = O(\delta(S)\sqrt{|S|})$. Notice that the above property is rather
natural: informally speaking, in a well-spread instance, any two stations
must not be “too close”. This is the typical situation in most of the radio
network configurations adopted in practice 0. It turns out that the optimal
bound in Eq. 2 holds for any family of well-spread instances. The following two
corollaries are thus easy consequences of Theorems 2 and 3.

Corollary 1. Let $\mathcal{S}$ be a family of well-spread instances. For any
$S \in \mathcal{S}$, it holds that

$$\mathrm{opt}_h(S) = \Theta\left(\delta(S)^2\,|S|^{1 + 1/h}\right),$$

for any positive integer constant $h$.

Corollary 2. Let $\mathcal{S}$ be any family of well-spread instances. Then, for
any positive integer constant $h$, the 2d Min h R. Assign. problem restricted to
$\mathcal{S}$ admits a polynomial-time approximation algorithm with constant
performance ratio (i.e. the restriction is in APX).

2 Preliminaries 

Let $S = \{s_1, \dots, s_n\}$ be a set of $n$ points (representing stations) in
$\mathcal{R}^2$ with the Euclidean distance
$d : \mathcal{R}^2 \times \mathcal{R}^2 \to \mathcal{R}^+$, where $\mathcal{R}^+$
denotes the set of non-negative reals. We define

$$\delta(S) = \min\{d(s_i, s_j) \mid s_i, s_j \in S,\ i \ne j\}$$

and

$$D(S) = \max\{d(s_i, s_j) \mid s_i, s_j \in S\}.$$
A range assignment for $S$ is a function $r : S \to \mathcal{R}^+$. The cost
$\mathrm{cost}(r)$ of $r$ is defined as

$$\mathrm{cost}(r) = \sum_{i=1}^{n} \bigl(r(s_i)\bigr)^2.$$

Observe that we have set the distance-power gradient $\beta$ to 2 (see Eq. 1);
however, our results can be easily extended to any constant $\beta \ge 1$.

The communication graph of a range assignment $r$ is the directed graph
$G_r(S, E)$ where $(s_i, s_j) \in E$ if and only if $r(s_i) \ge d(s_i, s_j)$. We
say that an assignment $r$ for $S$ is of diameter $h$ ($1 \le h \le n - 1$) if
the corresponding communication graph is strongly connected and has diameter
$h$ (in short, an $h$-assignment).
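The definitions above translate directly into a feasibility check. The Python
sketch below builds the communication graph of a range assignment and tests,
with one breadth-first search per station, whether every station reaches every
other one within $h$ hops. It reads "diameter $h$" as "at most $h$ hops", which
is an interpretation made for this example; the function names are likewise
hypothetical.

```python
import math
from collections import deque


def range_cost(ranges):
    """Cost of a range assignment: sum of squared ranges (beta = 2)."""
    return sum(r * r for r in ranges)


def communication_graph(points, ranges):
    """Adjacency lists: edge i -> j iff ranges[i] >= d(points[i], points[j])."""
    n = len(points)
    return [
        [j for j in range(n) if j != i and ranges[i] >= math.dist(points[i], points[j])]
        for i in range(n)
    ]


def is_h_assignment(points, ranges, h):
    """True if every ordered pair of stations is connected by at most h hops."""
    adj = communication_graph(points, ranges)
    n = len(points)
    for src in range(n):
        hops = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            if hops[u] == h:
                continue  # do not expand beyond h hops
            for v in adj[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        if len(hops) < n:
            return False
    return True


if __name__ == "__main__":
    chain = [(0, 0), (1, 0), (2, 0)]
    print(is_h_assignment(chain, [1, 1, 1], 2), range_cost([1, 1, 1]))
    print(is_h_assignment(chain, [2, 2, 2], 1), range_cost([2, 2, 2]))
```

On three colinear unit-spaced stations, unit ranges give a 2-hop assignment of
cost 3, whereas a 1-hop assignment with uniform range 2 costs 12, which
illustrates the trade-off between hop bound and power.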

As defined in the Introduction, given a set $S$ of $n$ points in $\mathcal{R}^2$
and a positive integer $h$, the 2d Min h R. Assign. problem consists of finding
an $h$-assignment $r_h^{\min}$ for $S$ of minimum cost. The cost of an optimal
$h$-assignment for a given set of stations $S$ is denoted as $\mathrm{opt}_h(S)$.

In the proof of our results we will make use of the well-known Hölder
inequality. We thus present it in the following convenient form. Let $x_i$,
$i = 1, \dots, k$ be a set of $k$ non-negative reals and let
$p, q \in \mathcal{R}$ such that $p \ge 1$ and $q \le 1$. Then, it holds that:
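One standard statement that fits these hypotheses, and that matches the way the
inequality is applied in Sections 3 and 4, is the power-mean form below. It is
given here as a reconstruction under the additional assumption $0 < q \le 1$,
not as a quotation of the original display.

```latex
\[
  \sum_{i=1}^{k} x_i^{\,p} \;\ge\; k \left( \frac{1}{k} \sum_{i=1}^{k} x_i \right)^{p}
  \qquad \text{and} \qquad
  \sum_{i=1}^{k} x_i^{\,q} \;\le\; k \left( \frac{1}{k} \sum_{i=1}^{k} x_i \right)^{q} .
\]
```

The first inequality expresses the convexity of $x \mapsto x^p$ for $p \ge 1$
and the second the concavity of $x \mapsto x^q$ for $0 < q \le 1$; in both
cases the extreme value is attained when all $x_i$ are equal.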




3 The Lower Bound 

Given a set $S$ of stations and a “base” station $b \in S$, we define
$\mathrm{opt}_h(S, b)$ as the minimum cost of any range assignment ensuring that
any station $s \in S$ can reach $b$ in at most $h$ hops. By the definition of the
2d Min h R. Assign. problem, it should be clear that the cost required by any
instance $S$ of this problem is at least $\mathrm{opt}_h(S, b)$, for any
$b \in S$. So, the main result of this section is an easy consequence of the
following lemma.

Lemma 1. Let $S$ be any set of stations such that $\delta(S) = 1$. For every
$b \in S$ and every positive constant integer $h$, it holds that

$$\mathrm{opt}_h(S, b) = \Omega\left(|S|^{1 + 1/h}\right).$$

Proof. We first observe that, since $\delta(S) = 1$, for sufficiently large sets
$S$ (more precisely, for any $S$ such that $|S| \ge 16$), the maximum number of
stations contained in a disk of radius $R = \sqrt{|S|}/3$ is at most $|S|/2$.
Let $r_h^{\text{all-to-one}}$ be a range assignment that ensures that all the
stations in $S$ can reach $b$ in at most $h$ hops. We prove that
$\mathrm{cost}(r_h^{\text{all-to-one}}) = \Omega(|S|^{1 + 1/h})$ by induction
on $h$.

For $h = 1$, consider the disk of radius $R$ centered at $b$. By the above
observation, there are at least $|S|/2$ stations at distance greater than $R$
from $b$. The cost required by such stations to reach $b$ in one hop is at least

$$(|S|/2)\,R^2 = \Omega(|S|^2).$$

Let $h \ge 2$. We define

$$\mathit{FAR} = \{s \in S \mid d(s, b) > R\}.$$

Clearly, we have that $|\mathit{FAR}| \ge |S|/2$. Every station $s$ in
$\mathit{FAR}$ must reach $b$ in $k \le h$ hops; it thus follows that there exist
$k \le h$ positive reals $x_1, \dots, x_k$ (where $x_i$ is the distance covered
by the $i$-th hop of the communication from $s$ to $b$) such that

$$x_1 + x_2 + \dots + x_k \ge R.$$

So, at least one index $j$ exists for which $x_j \ge R/k \ge R/h$. We can thus
define the set of “bridge” stations

$$B = \{s \in S \mid r_h^{\text{all-to-one}}(s) \ge R/h\}.$$

Two cases may arise. 

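Case $|B| \ge |S|^{1/h}$. In this case, since $R = \sqrt{|S|}/3$,

$$\sum_{s \in S} \bigl(r_h^{\text{all-to-one}}(s)\bigr)^2 \ge |B| \left(\frac{R}{h}\right)^2 \ge \frac{|S|^{1 + 1/h}}{9h^2} = \Omega\left(|S|^{1 + 1/h}\right).$$
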
Case $|B| < |S|^{1/h}$. By means of the assignment $r_h^{\text{all-to-one}}$,
every station in $\mathit{FAR}$ reaches in at most $h - 1$ hops some bridge
station. Let $B = \{b_1, \dots, b_{|B|}\}$. So, we can partition the set
$\mathit{FAR} \cup B$ into $|B|$ subsets $A_1, \dots, A_{|B|}$ such that all the
stations in $A_i$ reach $b_i$ in at most $h - 1$ hops.¹ So,

$$\sum_{s \in S} \bigl(r_h^{\text{all-to-one}}(s)\bigr)^2 \ge \sum_{i=1}^{|B|} \mathrm{opt}_{h-1}(A_i, b_i) = \Omega\left(\sum_{i=1}^{|B|} |A_i|^{1 + \frac{1}{h-1}}\right),$$

where the last bound is a consequence of the inductive hypothesis. Since

$$\sum_{i=1}^{|B|} |A_i| = |\mathit{FAR} \cup B| \ge |S|/2,$$

the Hölder inequality (see Section 2) implies that

$$\sum_{i=1}^{|B|} |A_i|^{1 + \frac{1}{h-1}} \ge |B| \left(\frac{|S|}{2|B|}\right)^{1 + \frac{1}{h-1}} = \Omega\left(|S|^{1 + 1/h}\right),$$

where the last equivalence is due to the condition $|B| < |S|^{1/h}$.

¹ Notice that if a station reaches two or more bridge stations, we can put the
station into any of the corresponding sets $A_i$. We also assume that
$b_i \in A_i$, for $1 \le i \le |B|$.

Proof of Theorem 2.

For $\delta(S) = 1$, the theorem is an immediate consequence of Lemma 1. The
general case $\delta(S) > 0$ can be reduced to the previous case by simply
rescaling the instance by a factor of $1/\delta(S)$.

□




4 The Upper Bound 

Proof of Theorem 3.

The proof consists of a recursive construction of an $h$-assignment $r_h(S)$
having cost $O(D(S)^2 |S|^{1/h})$. For $h = 1$, $r_1(S)$ assigns a range $D(S)$
to each station in $S$. Thus, $\mathrm{cost}(r_1(S)) = D(S)^2 |S|$.

Let us consider the smallest square $Q$ that contains all points in $S$. Notice
that the side $l$ of $Q$ is at most $D(S)$. Let us consider a grid that
subdivides $Q$ into $k^2$ subsquares of the same size $l/k$ (the choice of $k$
will be given later).

Informally speaking, for every non empty subsquare we choose a “base” sta- 
tion and we give power sufficient to let it cover all the stations in S in one hop. 
Then, in every subsquare we complete the assignment by making any station
able to reach the base station in $h - 1$ hops. For this task we apply the
recursive construction.
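The recursive construction can be sketched in Python as follows. This is an
illustration under several assumptions that are not taken from the paper: the
base station of a subsquare is simply its first station, the covering range is
the diagonal of the bounding square (with a tiny slack against floating-point
rounding) rather than $D(S)$ itself, and the grid parameter is
$k = \lceil |S|^{1/(2h)} \rceil$. The helper `hop_diameter` is included only to
check the result.

```python
import math
from collections import deque

SLACK = 1.0 + 1e-12  # guards range comparisons against floating-point rounding


def build_assignment(points, h):
    """Return one range per station so that any station reaches any other in <= h hops."""
    if h < 1:
        raise ValueError("h must be a positive integer")
    ranges = [0.0] * len(points)
    _assign(points, list(range(len(points))), h, ranges)
    return ranges


def _assign(points, idx, h, ranges):
    if len(idx) <= 1:
        return
    xs = [points[i][0] for i in idx]
    ys = [points[i][1] for i in idx]
    x0, y0 = min(xs), min(ys)
    side = max(max(xs) - x0, max(ys) - y0)     # side of the bounding square Q
    reach = side * math.sqrt(2) * SLACK        # diagonal of Q covers all of idx
    if h == 1 or side == 0:
        for i in idx:
            ranges[i] = max(ranges[i], reach)  # base case: everyone covers everyone
        return
    k = max(1, math.ceil(len(idx) ** (1.0 / (2 * h))))  # assumed grid parameter
    cells = {}
    for i in idx:
        cx = min(k - 1, int((points[i][0] - x0) / side * k))
        cy = min(k - 1, int((points[i][1] - y0) / side * k))
        cells.setdefault((cx, cy), []).append(i)
    for members in cells.values():
        _assign(points, members, h - 1, ranges)   # h-1 hops inside the subsquare
        base = members[0]                         # hypothetical choice of base station
        ranges[base] = max(ranges[base], reach)   # base covers all of idx in one hop


def hop_diameter(points, ranges):
    """Largest hop distance over all ordered pairs (math.inf if some pair is disconnected)."""
    n, worst = len(points), 0
    for src in range(n):
        hops = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if v not in hops and ranges[u] >= math.dist(points[u], points[v]):
                    hops[v] = hops[u] + 1
                    queue.append(v)
        if len(hops) < n:
            return math.inf
        worst = max(worst, max(hops.values()))
    return worst


if __name__ == "__main__":
    grid = [(x, y) for x in range(4) for y in range(4)]
    for h in (1, 2, 3):
        r = build_assignment(grid, h)
        print(h, hop_diameter(grid, r), round(sum(v * v for v in r), 2))
```

Any station first reaches the base station of its subsquare in at most $h - 1$
hops, and that base station reaches every other station in one further hop,
which gives the bound of $h$ hops overall.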

The cost of $r_h(S)$ is thus bounded by

$$\mathrm{cost}(r_h(S)) \le k^2 D(S)^2 + \sum_{i=1}^{k^2} \mathrm{cost}(r_{h-1}(S_i)),$$

where $S_i$ is the set of the stations in the $i$-th subsquare. Since
$D(S_i) = O(D(S)/k)$ we apply the inductive hypothesis and we obtain

$$\mathrm{cost}(r_h(S)) = O\left(k^2 D(S)^2 + \frac{D(S)^2}{k^2} \sum_{i=1}^{k^2} |S_i|^{\frac{1}{h-1}}\right) = O\left(k^2 D(S)^2 + \frac{D(S)^2\,|S|^{\frac{1}{h-1}}}{k^{\frac{2}{h-1}}}\right),$$

where the last equality follows from the Hölder inequality (see Section 2) and
from the fact that $\sum_{i=1}^{k^2} |S_i| = |S|$. Now we choose

$$k = |S|^{\frac{1}{2h}}$$

in order to equate the additive terms in the last part of the above equation. By
replacing this value in the equation we obtain

$$\mathrm{cost}(r_h(S)) = O\left(D(S)^2\,|S|^{1/h}\right).$$

It is easy to verify that the partition of $Q$ into subsquares and the rest
of the computation in each inductive step can be done in time $O(|S|)$. So, the
overall time complexity is $O(h|S|)$.

□ 












5 Tight Bounds and Approximability 

Let us consider the simple instance $G_n$ of 2d Min h R. Assign. in which $n$
stations are placed on a square grid of side $\sqrt{n}$, and the distance between
adjacent pairs of stations is 1.

By combining Theorems 2 and 3 we easily obtain that

$$\mathrm{opt}_h(G_n) = \Theta\left(n^{1 + 1/h}\right).$$

This also implies that the range assignment constructed in the proof of
Theorem 3 yields a constant-factor approximation.

It turns out that the above considerations can be extended to any
“well-spread” instance.

Definition 1. A family $\mathcal{S}$ of well-spread instances is a family of
instances $S$ such that $D(S) = O(\delta(S)\sqrt{|S|})$.
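To make the well-spread condition concrete, the Python sketch below computes the
ratio $D(S) / (\delta(S)\sqrt{|S|})$ by brute force over all pairs. For a family
of well-spread instances this ratio stays bounded by a constant; the function
name and the two test instances are chosen for illustration only.

```python
import math
from itertools import combinations


def spread_ratio(points):
    """Return D(S) / (delta(S) * sqrt(|S|)) for at least two distinct points."""
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    if not dists or min(dists) == 0:
        raise ValueError("need at least two points, all distinct")
    return max(dists) / (min(dists) * math.sqrt(len(points)))


if __name__ == "__main__":
    grid = [(x, y) for x in range(5) for y in range(5)]   # 25 stations on a square grid
    chain = [(x, 0) for x in range(25)]                   # 25 stations on a line
    print(round(spread_ratio(grid), 4), round(spread_ratio(chain), 4))
```

On the unit square grid the ratio stays below $\sqrt{2}$ for every $n$, whereas
on the unit chain it grows like $\sqrt{n}$, so the chain is not a well-spread
family.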

The following two corollaries are easy consequences of Theorems 2 and 3.
Corollary 3. Let $\mathcal{S}$ be a family of well-spread instances. Then, for
any $S \in \mathcal{S}$,

$$\mathrm{opt}_h(S) = \Theta\left(\delta(S)^2\,|S|^{1 + 1/h}\right),$$

for any positive integer constant $h$.

Corollary 4. Let $\mathcal{S}$ be any family of well-spread instances. Then, the
2d Min h R. Assign. problem restricted to $\mathcal{S}$ is in APX, for any
positive integer constant $h$.

6 Open Problems 

As discussed in the Introduction, finding bounds for the power consumption of
general classes of radio networks might prove very useful in network design.
Thus it is interesting to derive new, tighter bounds for possibly a larger class
of radio network configurations.

Another question left open by this paper is whether the 2d Min h R. Assign.
problem is NP-hard for constant $h$. We conjecture a positive answer even though
we believe that any proof will depart significantly from those adopted for the
unbounded cases (i.e. the Min R. Assign. problem - see m). More precisely, all the
reductions adopted in the unbounded cases start from the minimum vertex cover 
problem that seems to be very unsuitable for our problem. When the stations 
are arranged on a line (i.e. the 1-dimensional case), we instead conjecture that 
the problem is in P for any value of h. 